March 1951
Vol. 7 No. 1

JOURNAL OF THE BIOMETRIC SOCIETY 


The Present Status of Variance Component Analysis    S. Lee Crump
Testing a Linear Relation among Variances    W. G. Cochran
Components in Regression    John W. Tukey
Analysis of Variance with Unequal but Proportionate Numbers of Observations
in the Sub-Classes of a Two-Way Classification    H. Fairfield Smith
Consistency of Estimates of Variance Components    R. E. Comstock & H. F. Robinson
Use of Components of Variance in Preparing Schedules for the Sampling
of Baled Wool    J. M. Cameron
Variance Components as a Tool for the Analysis of Sample Data    Walter A. Hendricks
Estimating Precision of Textile Instruments    John C. Whitwell




The Biometric Society
FOUNDED BY THE BIOMETRICS SECTION OF THE AMERICAN STATISTICAL ASSOCIATION


Material for Biometrics should be addressed to the Chairman of the Editorial Board, 
Institute of Statistics, North Carolina State College, Raleigh, N. C.; and material for 
Queries should go to “Queries”, Statistical Laboratory, Iowa State College, Ames, 
Iowa, or to any member of the committee. 


Articles to be considered for publication in Biometrics should be submitted in triplicate. 


THE BIOMETRIC SOCIETY 
General Officers 


President, Arthur Linder; Secretary-Treasurer, C. I. Bliss; Council, Maurice H. Belz, 
Joseph Berkson, William G. Cochran, Georges Darmois, David J. Finney, R. A. 
Fisher, John W. Hopkins, N. K. Jerne, P. C. Mahalanobis, Donald Mainland, 
Kenneth Mather, Margaret Merrell, V. G. Panse, O. E. Sette, P. V. Sukhatme, 
O. Tedin, E. B. Wilson, Frank Yates. 


Regional Officers 


Eastern North American Region: Vice President, H. W. Norton; Secretary-Treasurer, 
Walter T. Federer. British Region: Vice President, R. A. Fisher; Secretary, D. J. 
Finney; Treasurer, A. R. G. Owen. Western North American Region: Vice President, 
G. A. Baker; Secretary-Treasurer, W. C. Rollins. Australasian Region: Vice President, 
C. W. Emmens; Secretary-Treasurer, J. A. Keats. Indian Region: Vice President, 
P. C. Mahalanobis; Secretary, C. Radhakrishna Rao; Treasurer, Anakul Chandra
Das. French Region: Vice President, Maurice Frechet; Secretary-Treasurer, Daniel
Schwartz. 


Editorial Board 
Biometrics 
Chairman, Gertrude M. Cox; Members, C. I. Bliss, E. A. Cornish, W. D. Dixon,
John W. Fertig, D. J. Finney, O. Kempthorne, A. M. Mood, Horace Norton,
H. Fairfield Smith, G. W. Snedecor, George Teissier and Jane Worcester.


The Biometric Society is an international society devoted to the mathematical and statistical 
aspects of biology and welcomes to membership biologists, mathematicians, statisticians and others who 
are interested in its objectives. Through its six regional organizations the Society sponsors regional 
and local meetings. National secretaries serve the interests of members in Italy, Denmark and the
Netherlands and there are many members “at large”. Dues in the Society for 1951 are as follows: Full
membership including subscription to Biometrics is $7.00. Members of the Biometrics Section of the 
American Statistical Association who subscribe to the journal through that organization may become 
members of The Biometric Society on the payment of $3.00 annual dues. 


Annual subscription rates to non-members are as follows: For American Statistical Association
Members, $4.00; for subscribers, non-members of either American Statistical Association or The
Biometric Society, $7.00. Subscription and application should be sent to The Biometric Society,
52 Hillhouse Ave., New Haven, Connecticut, U.S.A. 


Entered as second-class matter at the Post Office at New Haven, Conn., under 
the Act of March 3, 1879. Additional entry at Richmond, Va. and Raleigh, North 
Carolina. Business Office, 52 Hillhouse Ave., New Haven, Conn. Biometrics is pub- 
lished quarterly—in March, June, September and December. 




THE PRESENT STATUS OF VARIANCE 
COMPONENT ANALYSIS 


S. LEE CRUMP*


University of Rochester Atomic Energy Project 
I. INTRODUCTION 


THE ELEMENTARY THEORY of variance component analysis has been
discussed in recent papers by Daniels (10), Crump (8, 9) and
Eisenhart (11). Since the appearance of the last of these the theory
has been extended in a number of directions. It seems worthwhile at
this time, then, to present a unified discussion of the theory in its present
state, indicating the available results, so that further work on
the theory may be directed at the important unsolved problems.
Although an effort will be made to avoid excessive discussion of ele- 
mentary well known results, enough of these will be included to make 
the general content of the paper essentially self contained and coherent. 

It seems pertinent to begin by defining rather carefully what prob- 
lems are included in the province of variance component analysis. This 
is most easily done by considering a particular type of multiple classi- 
fication which contains all of the essential elements of the situation. It 
must be remembered however that the general features of the next few 
paragraphs are common to any set of data arranged in a multiple 
classification and described by a linear model. Data arranged in a 
two-way classification with the same number of observations in each 
cell will be used to illustrate. Let the classes in the two criteria of
classification be $A_1, A_2, \ldots, A_a$ and $B_1, B_2, \ldots, B_b$ respectively.
Then if $y_{hij}$ denotes the j-th datum in subclass $A_h B_i$, the complete linear
model for these data is included in the following equations:

$$y_{hij} = \mu + \alpha_h + \beta_i + (\alpha\beta)_{hi} + \epsilon_{hij}, \qquad h = 1, 2, \ldots, a;\ i = 1, 2, \ldots, b;\ j = 1, 2, \ldots, n.$$


*Presented before the American Statistical Association annual meeting in Chicago, 1950, at sessions
held jointly by the Biometrics Section of the American Statistical Association and The Biometric
Society (ENAR).




2 BIOMETRICS, MARCH 1951 


The assumptions commonly made about the elements in this equation
fall into the following general framework: the element $\mu$ is a parameter
which remains fixed in repeated samples, while the elements
$\{\alpha_h\}$, $\{\beta_i\}$, $\{(\alpha\beta)_{hi}\}$, and $\{\epsilon_{hij}\}$ are random samples of sizes a, b, ab, and
abn respectively from populations $\pi_\alpha$, $\pi_\beta$, $\pi_{\alpha\beta}$, and $\pi_\epsilon$, each with
mean zero. The distribution of the $y_{hij}$ in repeated samples
is then further specified by assumptions about the sizes of, and probability
distributions in, each of these populations. Either Model I or
Model II of Eisenhart (11) is most commonly assumed. In Model I
the populations $\pi_\alpha$, $\pi_\beta$ and $\pi_{\alpha\beta}$ are of sizes a, b, and ab respectively so
that the $\{\alpha_h\}$, $\{\beta_i\}$ and $\{(\alpha\beta)_{hi}\}$ exhaust the whole of their respective
populations in every repeated sample. The population $\pi_\epsilon$ is continuous
with a normal probability distribution and variance $\sigma^2_\epsilon$. Model
I is the one underlying standard analysis of variance.

In Model II each of the four populations is continuous with normal
probability distribution. The variances in these populations are $\sigma^2_\alpha$,
$\sigma^2_\beta$, $\sigma^2_{\alpha\beta}$ and $\sigma^2_\epsilon$ respectively. Eisenhart (11) has called the case in which
some of the first three populations fit Model I while the others fit
Model II, the Mixed Model.
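As an illustration of what Model II asserts, the following is a minimal sketch that draws one data set from the two-way linear model with every random element normal and independent. The function name `simulate_model_ii` and its argument names are hypothetical, chosen only for this example; nothing here is taken from the papers cited.

```python
import random


def simulate_model_ii(a, b, n, mu, var_a, var_b, var_ab, var_e, seed=None):
    """Return y[h][i][j] = mu + alpha_h + beta_i + (alpha beta)_hi + eps_hij.

    Every random element is an independent normal draw with mean zero,
    which is the Model II assumption described in the text.
    """
    rng = random.Random(seed)
    alpha = [rng.gauss(0.0, var_a ** 0.5) for _ in range(a)]
    beta = [rng.gauss(0.0, var_b ** 0.5) for _ in range(b)]
    inter = [[rng.gauss(0.0, var_ab ** 0.5) for _ in range(b)] for _ in range(a)]
    return [
        [
            [mu + alpha[h] + beta[i] + inter[h][i] + rng.gauss(0.0, var_e ** 0.5)
             for _ in range(n)]
            for i in range(b)
        ]
        for h in range(a)
    ]


if __name__ == "__main__":
    y = simulate_model_ii(a=3, b=4, n=2, mu=10.0,
                          var_a=4.0, var_b=1.0, var_ab=0.5, var_e=1.0, seed=1)
    print(len(y), len(y[0]), len(y[0][0]))
```

Under Model I the same equation would hold, but the `alpha`, `beta` and `inter` values would be held fixed from sample to sample and only the error term redrawn.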

The excessive restrictiveness of these models has been emphasized
by Tukey (31, 32, 33) and he proposes Models III, IV and V as follows:
Model III (Pure randomization) is the same as Model I except that
the population $\pi_\epsilon$ is also finite, of size abn. In Model IV (Finite populations)
each of the four populations is finite in size. The sizes are $N_\alpha$,
$N_\beta$, $N_{\alpha\beta}$ and $N_\epsilon$ respectively. Model IV includes Model III as a
special case when the N-values are a, b, ab and abn respectively. In
Model V the individual $\alpha_h$, $\beta_i$, $(\alpha\beta)_{hi}$ and $\epsilon_{hij}$ are regarded as random
samples of size one from infinite populations with variances $\sigma^2_{\alpha_h}$, $\sigma^2_{\beta_i}$,
$\sigma^2_{(\alpha\beta)_{hi}}$ and $\sigma^2_{\epsilon_{hij}}$ respectively. Model V includes Model II as a special
case when each of the populations is normal and the variances listed
are all independent of the sub-subscripts. Tukey (31) proposes as a
very general model, Model X, in which any element of the original linear
equation is the sum of three independent subelements, the first satisfying
Model III, the second Model IV and the third Model V.

The most general of these models still does not allow non-independence
between two elements which are not of the same kind. Tukey (32)
considers the case in which n = 1 (i.e. only one observation in each cell),
$\pi_\alpha$, $\pi_\beta$ and $\pi_\epsilon$ as well as the $\alpha_h$, $\beta_i$ and $\epsilon_{hi}$ satisfy Model IV, but the
size of $\pi_{\alpha\beta}$ is fixed at $N_{\alpha\beta} = N_\alpha N_\beta$ with a one-to-one correspondence
existing between pairs of elements in $\pi_\alpha$ and $\pi_\beta$ and single elements in
$\pi_{\alpha\beta}$.


Problems of variance component analysis may be considered to be
those problems relating to estimating and testing hypotheses about the
variances in the populations corresponding to the elements of the linear
equations given in the introduction. Most of the discussion of this paper
will be in the framework of Model II. Many of the results stated are
however valid under Model X with some obvious changes in the meanings
of the notation used in our discussion. We shall endeavor to
indicate where this validity under Model X holds. We shall not however
attempt to indicate the precise interpretations of the notation
which are necessary in this case. We prefer to refer the reader to Tukey
(31) for a detailed discussion in this regard. Since many of the results
to which reference will be made involve complex expressions we shall
frequently only indicate their nature and not write them out explicitly.


II. BALANCED MULTIPLE CLASSIFICATIONS 


For purposes of this discussion a multiple classification is balanced 
if all of the classes or subclasses of any chosen rank contain the same 
number of observations. The mean squares in the standard analysis 
of variance for any balanced classification may be utilized for esti- 
mating variance components. The method in general is to determine 
the expected values of the mean squares and set each observed mean 
square equal to its expected value. The resulting equations are then 
solved for the variance components and the solution used as the estimate 
of the variance components. This process may be illustrated for the 
two-way classification referred to previously. The analysis of variance 
for such data takes the following form, where the notation for the 
variance components corresponds to that under Model II in the intro- 
duction: 


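Source of          Degrees of          Mean        Expected Value of
Variation          Freedom             Square      Mean Square
Among A-classes    a - 1               $M_A$       $\sigma^2_\epsilon + n\sigma^2_{\alpha\beta} + nb\sigma^2_\alpha$
Among B-classes    b - 1               $M_B$       $\sigma^2_\epsilon + n\sigma^2_{\alpha\beta} + na\sigma^2_\beta$
AB Interaction     (a - 1)(b - 1)      $M_{AB}$    $\sigma^2_\epsilon + n\sigma^2_{\alpha\beta}$
Error              ab(n - 1)           $M_E$       $\sigma^2_\epsilon$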

The expectations of the mean squares are shown in the last column.
These are valid under Model X.¹ The estimate of $\sigma^2_\beta$ for example is

$$\hat\sigma^2_\beta = \frac{1}{na}\{M_B - M_{AB}\},$$


1. The variance of a finite population is defined as 1/(N — 1) {sum of squared deviations from the 
mean} , where N is the population size. 




and the other variance components are estimated similarly. The 
estimates obtained in this way are usually called unbiased. It should 
be noted however that the estimating method sometimes used in practice 
is not unbiased, for in that method, if the solution to the equations is 
negative for any component, that component is estimated to be zero. 
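The procedure of equating observed and expected mean squares can be sketched for the balanced two-way classification as follows. This is an illustrative sketch only: the function name `two_way_variance_components` and the nested-list data layout `y[h][i][j]` are assumptions made for the example. The divisors follow the expected mean squares of the two-way table, and n > 1 is required so that an error mean square exists.

```python
def two_way_variance_components(y):
    """Solve observed = expected mean squares for a balanced two-way layout.

    y[h][i][j] is the j-th observation in cell (A_h, B_i); every cell must
    hold the same number n > 1 of observations.
    """
    a, b, n = len(y), len(y[0]), len(y[0][0])
    cell = [[sum(y[h][i]) / n for i in range(b)] for h in range(a)]
    row = [sum(cell[h]) / b for h in range(a)]
    col = [sum(cell[h][i] for h in range(a)) / a for i in range(b)]
    grand = sum(row) / a

    ms_a = n * b * sum((r - grand) ** 2 for r in row) / (a - 1)
    ms_b = n * a * sum((c - grand) ** 2 for c in col) / (b - 1)
    ms_ab = n * sum((cell[h][i] - row[h] - col[i] + grand) ** 2
                    for h in range(a) for i in range(b)) / ((a - 1) * (b - 1))
    ms_e = sum((v - cell[h][i]) ** 2
               for h in range(a) for i in range(b) for v in y[h][i]) / (a * b * (n - 1))

    # Raw solutions of the equations; they are not truncated at zero here.
    return {
        "mean_squares": {"A": ms_a, "B": ms_b, "AB": ms_ab, "E": ms_e},
        "sigma2_alpha": (ms_a - ms_ab) / (n * b),
        "sigma2_beta": (ms_b - ms_ab) / (n * a),
        "sigma2_alphabeta": (ms_ab - ms_e) / n,
        "sigma2_eps": ms_e,
    }


if __name__ == "__main__":
    data = [[[1, 3], [2, 4]], [[5, 7], [6, 8]]]
    print(two_way_variance_components(data))
```

For this small made-up data set the interaction component comes out negative, which is the situation in which the practice of substituting zero, mentioned in the text, would apply.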

The expected mean squares and rules for determining them are given 
for almost any balanced classification in at least one of the following 
sources: Daniels (10), Crump (8, 9), Snedecor (30), Nordskog and
Crump (24).¹ It is not in general difficult to determine the expected
value of any mean square once the underlying linear equations are 
written and the assumptions about the sizes of the populations stated. 
Tukey (33) gives methods for doing this of wide generality. 

The estimates obtained by equating observed and expected mean 
squares (and replacing negative estimates by zero) are maximum likeli- 
hood estimates when Model II is assumed and the classification is 
balanced. See Crump (9). The properties of this estimating method 
are almost entirely unknown under other assumptions. The sampling 
variances of the estimates have been obtained, however, under very 
general conditions. These will be discussed first for Model II. 

Under Model II any mean square, say M, based on f degrees of
freedom, in the standard analysis of variance for a balanced multiple
classification is distributed like

$$\frac{E(M)\,\chi^2_f}{f}$$

where E( ) denotes mathematical expectation and $\chi^2_f$ is a chi-square
variable with f degrees of freedom. Further all the mean squares in
the analysis are mutually independent. Hence any estimated variance
component, say

$$\hat\sigma^2 = a_1 M_1 + a_2 M_2 + \cdots + a_k M_k,$$

where $M_i$ (i = 1, 2, ..., k) is a mean square based on $f_i$ degrees of
freedom, has sampling variance

$$V(\hat\sigma^2) = \frac{2a_1^2[E(M_1)]^2}{f_1} + \frac{2a_2^2[E(M_2)]^2}{f_2} + \cdots + \frac{2a_k^2[E(M_k)]^2}{f_k}$$

under Model II. Daniels (10) suggests as an unbiased estimate of $V(\hat\sigma^2)$

$$\frac{2a_1^2 M_1^2}{f_1 + 2} + \frac{2a_2^2 M_2^2}{f_2 + 2} + \cdots + \frac{2a_k^2 M_k^2}{f_k + 2}.$$

1. In references (10) and (24) the convenience of defining the variance in a finite population
through division by N - 1 is not utilized.




Satterthwaite (27) states that correction for bias is not appropriate and
suggests the use of $f_i$ in place of $f_i + 2$ (i = 1, 2, ..., k) in the preceding
equation. These results make it possible to write out and to
estimate the sampling variance of any estimated variance component
in a balanced multiple classification under Model II.
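The two estimates of the sampling variance just described differ only in the divisor, so both can be sketched with one small function. The name `component_variance_estimate` and the `offset` argument are hypothetical; `offset=2` gives the divisor $f_i + 2$ attributed to Daniels and `offset=0` the divisor $f_i$ attributed to Satterthwaite.

```python
def component_variance_estimate(coeffs, mean_squares, dfs, offset=2):
    """Estimate V(sigma_hat^2) for sigma_hat^2 = sum_i a_i * M_i.

    coeffs       -- the constants a_i
    mean_squares -- the observed, mutually independent mean squares M_i
    dfs          -- their degrees of freedom f_i
    offset       -- 2 for the divisor f_i + 2, 0 for the divisor f_i
    """
    return sum(2.0 * (a * m) ** 2 / (f + offset)
               for a, m, f in zip(coeffs, mean_squares, dfs))


if __name__ == "__main__":
    # Illustrative numbers only: sigma_hat^2 = (M_2 - M_1) / 4.
    coeffs = [0.25, -0.25]
    mean_squares = [10.0, 2.0]
    dfs = [4, 10]
    print(component_variance_estimate(coeffs, mean_squares, dfs, offset=2))
    print(component_variance_estimate(coeffs, mean_squares, dfs, offset=0))
```

With these illustrative numbers the divisor $f_i + 2$ gives 2.125 and the divisor $f_i$ gives 3.175, so the choice between the two matters most when the degrees of freedom are few.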

Many of the estimated variance components in the balanced case are
of the form

$$\hat\sigma^2 = \frac{1}{k}[M_2 - M_1]$$

where $M_1$ and $M_2$ are independent mean squares with $f_1$ and $f_2$ degrees
of freedom respectively, and expectations

$$L \quad \text{and} \quad L + k\sigma^2$$

respectively. The sampling distribution of such estimates under Model
II has been given by Pearson (26) and discussed by Satterthwaite (27)
and Bhattacharyya (3). It is complex and involves the nuisance
parameter $\sigma^2/L$. Attempts to construct exact confidence limits for
estimated variance components have not been successful. Several approximate
methods have been developed, however. They are discussed
and illustrated by Bross (4). Two of the methods depend upon approximating
the distribution of $\hat\sigma^2$, in the one case by a normal distribution
and in the other by a chi-square distribution.

A third method involves approximating an exact fiducial limit solution
of the Behrens-Fisher type, while a fourth utilizes the fact that
exact confidence limits for the ratio $\sigma^2/L$ are easily constructed from
the F-distribution. The accuracy of these approximations is considered
briefly by Bross (4) and Satterthwaite (27, 28) but more extensive
investigation of this question is needed.

The determination of the sampling variance of variance component 
estimates when Model II is not assumed presents some difficulty. 
Hammersley (18) obtained the sampling variances and covariance of
the estimates for a one-way classification under the assumption that the
populations involved are infinite but with arbitrary density function.
Tukey (33) more recently has given a method of great power and gene- 
rality for evaluating all of the cumulants of estimated variance com- 
ponents in both balanced and unbalanced classifications. Tukey as- 
sumes finite populations throughout and treats the infinite population 
as a limiting case. The method is complex and requires lengthy de- 
velopment so no attempt is made to describe it here. Tukey gives 
explicit general expressions for the sampling variances and covariances 
for the following balanced arrangements: one-way classification with 




equal numbers in the subclass, two-way classification with one ob- 
servation per cell, the latin square, and the balanced incomplete block 
arrangement. 

One result obtained by Tukey requires comment here. Tukey finds 
that in any balanced classification the sampling variance of any estimated
variance component except the “error” component does not
contain the fourth cumulant of the “error” population. (N.B. Tukey’s
definition of balance is not the same as that used here, but seems to be 
equivalent. In any event a classification which is balanced in the sense 
of this paper is also balanced in Tukey’s sense). If the arrangement is 
not balanced this is not the case. This state of affairs was noted by 
Hammersley (18) for the one-way classification. Tukey discusses this 
matter in some detail and notes the implication that balanced arrange- 
ments are very desirable when variance components are to be estimated. 
Balance assures that the variance of variance component estimates will 
not be seriously inflated by non-normality in the experimental errors. 

Returning now to Model II, consider the problem of testing the
hypothesis that $\sigma^2 = 0$ when two independent mean squares, $M_1$ and
$M_2$, of the form defined previously are available. On the null hypothesis
that $\sigma^2 = 0$ the ratio $M_2/M_1$ is distributed like ordinary F with
$f_2$ and $f_1$ degrees of freedom. Hence the usual F-test in the analysis of
variance under Model I can be used to test the hypothesis that $\sigma^2 = 0$
under Model II. The significance level for the test is the same under
either model, but the power functions differ. The power function of
the test under Model II is seen to take a relatively simple form when
it is noted that the ratio

$$\frac{M_2}{M_1} \cdot \frac{E(M_1)}{E(M_2)} = \frac{1}{1 + \lambda} \cdot \frac{M_2}{M_1}$$

is distributed like ordinary F with $f_2$ and $f_1$ degrees of freedom. Thus
the probability that the ratio $M_2/M_1$ shall exceed any value, say $F_{(\alpha)}$,
for given $\lambda = k\sigma^2/L$, may be obtained from tables of the F-distribution
or of the Incomplete B-function. The details of finding this probability
are given by Johnson (22) and Patnaik (25). Fairly extensive tables
of this power function are being prepared at the University of Rochester
Atomic Energy Project.
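The power function described above is the probability that an ordinary F variable exceeds $F_{(\alpha)}/(1 + \lambda)$. A rough numerical check can be made by simulation, as in the sketch below. This is an illustration under stated assumptions, not the tabulation method of the papers cited: the function name `power_model_ii` is hypothetical, the critical value must be supplied by the user from an F table (3.478 is used as the 5% point for 4 and 10 degrees of freedom), and the result carries Monte Carlo error.

```python
import random


def power_model_ii(lam, f2, f1, f_crit, n_draws=50_000, seed=0):
    """Monte Carlo estimate of P(M_2 / M_1 > f_crit) for lambda = k*sigma^2/L.

    Under Model II, (M_2 / M_1) / (1 + lam) is an ordinary F variable with
    f2 and f1 degrees of freedom, so the event is F > f_crit / (1 + lam).
    """
    rng = random.Random(seed)
    threshold = f_crit / (1.0 + lam)
    hits = 0
    for _ in range(n_draws):
        # A chi-square variable with f degrees of freedom is Gamma(f/2, scale 2).
        numerator = rng.gammavariate(f2 / 2.0, 2.0) / f2
        denominator = rng.gammavariate(f1 / 2.0, 2.0) / f1
        if numerator / denominator > threshold:
            hits += 1
    return hits / n_draws


if __name__ == "__main__":
    for lam in (0.0, 1.0, 4.0):
        print(lam, power_model_ii(lam, f2=4, f1=10, f_crit=3.478))
```

At $\lambda = 0$ the estimate should sit near the significance level, and it rises as $\lambda$ grows. An exact evaluation would replace the simulation by the F-distribution or the incomplete beta function, as the text indicates.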

Some brief tables which relate specifically to the balanced one-way
classification have been presented by Johnson (22) and Baines (1). If
the one-way classification has classes $A_1, A_2, \ldots, A_a$, each containing
n observations, the correspondence between the general notation and a
specific one for this particular case is as follows:




General Notation            Notation for balanced one-way classification
$M_1$                       “within groups” mean square
$M_2$                       “among groups” mean square
$E(M_1) = L$                $\sigma^2_\epsilon$
$E(M_2) = L + k\sigma^2$    $\sigma^2_\epsilon + n\sigma^2_\alpha$
$f_1$                       a(n - 1)
$f_2$                       a - 1


Johnson points out that there is a minimum value for a such that the
power of the test may be made less than $\alpha_1$, say, when $\sigma^2_\alpha/\sigma^2_\epsilon < Q_1$ and
greater than $1 - \alpha_2$, say, when $\sigma^2_\alpha/\sigma^2_\epsilon > Q_2$, with proper choice of n.
This minimum value depends only on the ratio $Q_2/Q_1$. A short table
of these minimum values for a with $\alpha_1 = \alpha_2 = .05$ and $\alpha_1 = \alpha_2 = .01$,
and for $Q_2/Q_1$ = 1.0, 1.5, 2.0, 2.5, 3.0 is given. The required values
for n are not given.

Baines gives a table of the values of $Q = \sigma^2_\alpha/\sigma^2_\epsilon$ for which the power
of the test conducted at a significance level of 5% is equal to 1/2, for
selected values of a from 2 to 31 and of n from 2 to 121. In the report
cited and in another (2), Baines presents an excellent practical discussion
of the determination of sample size for experiments designed to
estimate variance components.

Patnaik (25) plots the power curves for several pairs of values of a
and n and notes that for the selected cases this power curve (for Model
II) lies below the corresponding one for Model I. He conjectures that
this relation is true in general.¹

When it is desired to test the null hypothesis that a variance component,
$\sigma^2$, is equal to zero and there are not available independent
mean squares whose expectations are of the form of $E(M_1)$ and $E(M_2)$
utilized before, no entirely satisfactory test has been devised. In this
case there will be a mean square $M_2$ whose expectation is $L + k\sigma^2$ but
the only quantity whose expectation is L will be a linear combination
of other mean squares, say,

$$M = a_3 M_3 + a_4 M_4 + \cdots + a_k M_k.$$

As noted previously, M is not distributed as $\chi^2$ and the ratio $M_2/M$ is


1. Professor Tukey has suggested to me that the Student-Fisher discussion of systematic vs.
random experiments supplies a case when the power functions for Model I and Model II criss-cross.
I have not had an opportunity to pursue this suggestion.




not distributed like ordinary F on the null hypothesis.¹ The most
useful approach to this problem available at this time is Satterthwaite’s
(27, 28), where the device of fitting a chi-square distribution to the
distribution of M is used.² The effective degrees of freedom in the
approximating chi-square distribution are estimated by

$$f = \frac{M^2}{\dfrac{(a_3 M_3)^2}{f_3} + \dfrac{(a_4 M_4)^2}{f_4} + \cdots + \dfrac{(a_k M_k)^2}{f_k}}$$

and the ratio $M_2/M$ referred to the ordinary F-distribution with $f_2$
and f degrees of freedom. Very little is known about the accuracy of this
approximation, but to date it is the only practical one available.
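The effective degrees of freedom used in this approximation are simple to compute once the coefficients, mean squares and their degrees of freedom are listed. The sketch below uses the hypothetical name `satterthwaite_df` and made-up numbers; it returns both the linear combination M and the estimated degrees of freedom f.

```python
def satterthwaite_df(coeffs, mean_squares, dfs):
    """Return (M, f) for M = sum_i a_i * M_i.

    f = M^2 / sum_i (a_i * M_i)^2 / f_i is the estimated number of degrees
    of freedom of the chi-square distribution fitted to M.
    """
    combined = sum(a * m for a, m in zip(coeffs, mean_squares))
    denominator = sum((a * m) ** 2 / f
                      for a, m, f in zip(coeffs, mean_squares, dfs))
    return combined, combined ** 2 / denominator


if __name__ == "__main__":
    # Illustrative numbers only: M = M_3 + M_4 with 4 and 6 degrees of freedom.
    combined, f = satterthwaite_df([1.0, 1.0], [4.0, 6.0], [4, 6])
    print(combined, f)
```

The estimated f is generally not an integer, so in practice the F table is entered by interpolation or with the nearest whole number of degrees of freedom. When M consists of a single mean square, f reduces to that mean square's own degrees of freedom.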


III. UNBALANCED CLASSIFICATIONS 


The unbalanced arrangement which has received nearly all of the
attention to date is the one-way classification with unequal numbers of
observations in the groups. Since the problems of variance component
analysis in the unbalanced case are considerably more complex than in
the balanced case, and are less adequately solved, the unbalanced one-way
classification is here discussed in some detail. Let the groups again
be denoted $A_1, A_2, \ldots, A_a$, the h-th group containing $n_h$ observations.
If $y_{hj}$ is the j-th datum in the h-th group, the linear model for this case
is contained in the following equations:

$$y_{hj} = \mu + \alpha_h + \epsilon_{hj}, \qquad h = 1, 2, \ldots, a;\ j = 1, 2, \ldots, n_h,$$

where the $\{\alpha_h\}$ and $\{\epsilon_{hj}\}$ are random samples of size a and $n. = \sum_{h=1}^{a} n_h$
from populations $\pi_\alpha$ and $\pi_\epsilon$ with variances $\sigma^2_\alpha$ and $\sigma^2_\epsilon$, respectively.
The standard analysis of variance for such data takes the form


Source of Variation | Degrees of Freedom | Mean Square
Among Groups | a - 1 | $[1/(a - 1)] \sum_h n_h (\bar y_{h.} - \bar y_{..})^2 = M_A$
Within Groups | n. - a | $[1/(n. - a)] \sum_{h,j} (y_{hj} - \bar y_{h.})^2 = M_W$


1. Wilm (37) attacks this problem by considering the ratio $M/M_1$, where $E(M_1) = L'$, and M
is a linear combination of mean squares such that $E(M) = L' + k\sigma^2$. The problems in Wilm’s attack
are similar to those mentioned here.

2. The paper by W. G. Cochran, “Testing a linear relation among variances”, which appears in this
issue deals with a generalization of this approach.




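where

$$\bar y_{h.} = \frac{1}{n_h} \sum_{j=1}^{n_h} y_{hj}, \quad h = 1, 2, \ldots, a \quad \text{and} \quad \bar y_{..} = \frac{1}{n.} \sum_{h=1}^{a} n_h \bar y_{h.}.$$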


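Under the assumptions of Model X the two mean squares have
expected values

$$E(M_A) = \sigma^2_\epsilon + n_0 \sigma^2_\alpha, \qquad E(M_W) = \sigma^2_\epsilon,$$

where $n_0 = \frac{1}{a - 1}\left(n. - \frac{\sum_h n_h^2}{n.}\right)$.
This result seems to have been given first by Cochran (7) and is derived
explicitly in several places (1, 18, 38). The estimates of the variance
components that suggest themselves are

$$\hat\sigma^2_\epsilon = M_W$$

$$\hat\sigma^2_\alpha = \frac{1}{n_0}(M_A - M_W).$$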


The sampling variances and covariances of these estimates under Model
IV are given by Tukey (33). The same results are given by Hammersley
(18) for the limiting case of Model IV in which all the populations are
assumed infinite.
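The estimates for the unbalanced one-way classification can be sketched as follows. The function name `unbalanced_one_way` and the list-of-lists data layout are assumptions made for the example. The coefficient `n0` is taken to be the usual one for this layout, $n_0 = [n. - \sum_h n_h^2 / n.]/(a - 1)$, which reduces to n when every group has n observations.

```python
def unbalanced_one_way(groups):
    """Analysis-of-variance estimates for a one-way layout with unequal n_h.

    groups is a list of lists; groups[h] holds the n_h observations of A_h.
    At least one group must contain more than one observation.
    """
    a = len(groups)
    sizes = [len(g) for g in groups]
    n_dot = sum(sizes)
    means = [sum(g) / len(g) for g in groups]
    grand = sum(sum(g) for g in groups) / n_dot

    ms_among = sum(n * (m - grand) ** 2 for n, m in zip(sizes, means)) / (a - 1)
    ms_within = sum((y - m) ** 2
                    for g, m in zip(groups, means) for y in g) / (n_dot - a)
    # Coefficient of sigma^2_alpha in the expectation of the among-groups mean square.
    n0 = (n_dot - sum(n * n for n in sizes) / n_dot) / (a - 1)

    return {
        "M_A": ms_among,
        "M_W": ms_within,
        "n0": n0,
        "sigma2_eps": ms_within,
        "sigma2_alpha": (ms_among - ms_within) / n0,
    }


if __name__ == "__main__":
    print(unbalanced_one_way([[1, 3], [4, 6, 8], [10]]))
```

As in the balanced case, the solution for the among-groups component can be negative; the sketch returns the raw value and leaves any truncation at zero to the user.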

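Crump (9) has investigated this case under Model II, i.e. $\pi_\alpha$ and
$\pi_\epsilon$ normal. $M_W$ is still distributed like $\chi^2$ under Model II, but $M_A$,
though independent of $M_W$, is not so distributed in the unbalanced
arrangement (except when $\sigma^2_\alpha = 0$) and the sampling variances of the
estimates no longer take the simple form for the balanced arrangements.
They turn out to be

$$V(\hat\sigma^2_\epsilon) = \frac{2\sigma^4_\epsilon}{n. - a}$$

$$V(\hat\sigma^2_\alpha) = \frac{2\sigma^4_\epsilon}{n_0^2} \left\{ \frac{1}{(a - 1)^2} \left[ \sum_h \frac{n_h^2}{w_h^2} - \frac{2}{n.} \sum_h \frac{n_h^3}{w_h^2} + \frac{1}{n.^2} \left( \sum_h \frac{n_h^2}{w_h} \right)^2 \right] + \frac{1}{n. - a} \right\}$$

where

$$Q = \frac{\sigma^2_\alpha}{\sigma^2_\epsilon} \quad \text{and} \quad w_h = \frac{n_h}{1 + Q n_h}.$$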


The estimates $\hat\sigma^2_\epsilon$ and $\hat\sigma^2_\alpha$ are no longer the maximum likelihood estimates
as they were in the balanced case. It should be noted that the
general mean, $\bar y_{..}$, is also not the maximum likelihood estimate of $\mu$



under Model II in this and other unbalanced arrangements. See Cochran
(5). The maximum likelihood equations are developed in reference
(9), and an approximate solution is given there. No explicit solution
is possible however and the exact sampling variances have not been
obtained for comparison with $V(\hat\sigma^2_\epsilon)$ and $V(\hat\sigma^2_\alpha)$. The large sample results
are given by Crump (9) and, denoting by $V(\tilde\sigma^2_\epsilon)$ and $V(\tilde\sigma^2_\alpha)$ the large
sample variances of the maximum likelihood estimates, the following
results are obtained:

[The expressions for the ratios $V(\tilde\sigma^2_\epsilon)/V(\hat\sigma^2_\epsilon)$ and $V(\tilde\sigma^2_\alpha)/V(\hat\sigma^2_\alpha)$ are illegible in the source.]

The first expression above may be written

[expression illegible in the source]

where $an = n.$ and $0 < \theta < 1$; hence the estimate $\hat\sigma^2_\epsilon$ has the same
variance as $\tilde\sigma^2_\epsilon$ as $n \to \infty$, independently of a.

The second ratio is so complex that its behavior has not been
successfully studied. For fixed values of $Q = \sigma^2_\alpha/\sigma^2_\epsilon$ and a, the limit of
the ratio as n becomes large is less than

$$1 - 1/a.$$


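It appears that the estimate $\hat\sigma^2_\alpha$ has low relative efficiency when the
number of groups is small. Crump (9) suggests that in the case where
n is large the substitution of $\sigma^2_\epsilon = \hat\sigma^2_\epsilon$ into the maximum likelihood
equations may lead to improved estimates of $\sigma^2_\alpha$. Tukey (33) considers
the following estimates for the case in point:

$$\sigma^{*2}_\epsilon = M_W$$

[the companion estimate of $\sigma^2_\alpha$ is illegible in the source]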


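and determines the sampling variances of these estimates under Model
IV as well as under Model II. Comparisons among $V(\hat\sigma^2_\alpha)$, $V(\tilde\sigma^2_\alpha)$ and
$V(\sigma^{*2}_\alpha)$ (the variances of the analysis of variance, maximum likelihood and
Tukey estimates respectively) have not yet been made.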

The only practical method for placing confidence limits on $\sigma^2_\alpha$ seems
to be that of using normal approximations to the distributions of $\hat\sigma^2_\alpha$,
$\tilde\sigma^2_\alpha$ or $\sigma^{*2}_\alpha$. Wald (34) gives an exact method for confidence limits on
the ratio Q.

Very little useful progress has been made in the treatment of two- 
way classifications with unequal numbers in the subclasses. Even under 
Model I a number of different methods of analysis are in common use. 
Chief among these are the so-called methods of fitting constants, ex- 
pected subclass numbers, unweighted squares of means, and weighted 
squares of means. See Yates (39) and Snedecor (30). Whatever the 
method of analysis, estimates of the variance components may be ob- 
tained by the device of equating observed and expected mean squares. 
The expected mean squares for the method of expected subclass numbers 
and the method of unweighted squares of means are given by Crump 
(9), while those for the method of weighted squares of means are given 
by Federer (12). Henderson (19) and Federer (12) give the expected 
mean squares for the method of fitting constants. Since it is expected 
that these results will be published and since they are lengthy they are 
not reproduced here. For the first two methods above Crump has 
developed the sampling variances of the estimates. These are extremely 
complex and it has not been possible to make any general comparisons 
between the two methods. Since the method of unweighted squares of
means is simplest computationally and since there is no assurance yet
that one of the other methods is “better”, it seems to be the indicated
method.

Lucas (23) and Henderson (20) have given some consideration to 
the estimation of variance components when the data are analysed by 
the method of fitting constants. Lucas has developed a computational 
scheme for computing the estimates, within the framework of the ab- 
breviated Doolittle method for solving the normal equations. The 
details of this method are not yet published, and the sampling variances 
of the estimates are not yet derived. It is to be hoped that application 
of Tukey’s (33) methods will produce some results along these lines. 

Wald (35, 36) has extended the method referred to in regard to the 
unbalanced one-way classification so that exact confidence limits for 
the ratio of any variance component to the error component may be 
obtained in any kind of multiple classification. When the estimate of 
the error variance is based on a large number of degrees of freedom, 
Wald’s method may be modified to give approximate confidence limits 
on any variance component. See Bross (4). 




Ganguli (16), Hetzer et al. (21) and Finkner et al. (13) give the expected
mean squares in the standard analysis of variance for unbalanced
“nested” classifications of any order, but these results have been
pushed no further.


IV. MISCELLANEOUS TOPICS 


In this section a few isolated results are mentioned which do not fit 
easily into sections II and III. In addition there are some general 
remarks and suggestions. 

Cochran (6) and Crump (9) have considered briefly the variance 
component problem in a balanced one-way classification with a co- 
variate, x. Using the same notation as previously for the one-way 
classification, the linear model for this case is 


$$y_{hj} = \mu + \alpha_h + \beta x_{hj} + \epsilon_{hj}, \qquad h = 1, 2, \ldots, a;\ j = 1, 2, \ldots, n,$$

where the x’s are known constants, $\beta$ the unknown regression coefficient,
and the remaining elements have the same meaning as before. The 
standard analysis of covariance for these data may be shown sym- 
bolically as follows: 


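Source of Variation | Degrees of Freedom | Sums of Squares and Products | Errors of Estimate: Sums of Squares | D.F. | Mean Squares
Total | an - 1 | $T_{yy}$ $T_{xy}$ $T_{xx}$ | $T_{yy} - T_{xy}^2/T_{xx}$ | |
Among Groups | a - 1 | $A_{yy}$ $A_{xy}$ $A_{xx}$ | $A_{yy} - A_{xy}^2/A_{xx}$ | a - 2 | $M_A$
Within Groups | a(n - 1) | $W_{yy}$ $W_{xy}$ $W_{xx}$ | $W_{yy} - W_{xy}^2/W_{xx}$ | a(n - 1) - 1 | $M_W$
Difference | | | $T_{yy} - W_{yy} - T_{xy}^2/T_{xx} + W_{xy}^2/W_{xx}$ | a - 1 | $M'_A$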


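The estimate of $\sigma^2_\epsilon$ is just

$$\hat\sigma^2_\epsilon = M_W$$

again, with sampling variance

$$\frac{2\sigma^4_\epsilon}{a(n - 1) - 1}.$$

For $\sigma^2_\alpha$ two estimates are proposed:

$$\hat\sigma^2_\alpha = \frac{1}{n}(M_A - M_W), \quad \text{or} \quad \hat\sigma'^2_\alpha = \frac{(a - 1)(M'_A - M_W)}{n(a - 2 + r)},$$

where

$$r = W_{xx}/T_{xx}.$$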

The ratio of the sampling variances of these two estimates turns out to
be, under Model II,

[The expression for the ratio of the two sampling variances is illegible in the source.]

which is > 1 for r = 1 and < 1 for r = 0. Hence, on the basis of their
sampling variances, sometimes the first estimate is the better while at other
times the second is the better. The point $r_0$ at which the ratio is unity is a complicated
function of Q, a and n. A linear combination with varying weights is
indicated, but is not yet worked out. The sampling variances of both
estimates approach that of the maximum likelihood estimate in large
samples.

Questions are frequently raised about the testing of hypotheses in
cases where the model is a mixture of Model I and Model II. Johnson
(22) discusses the balanced two-way classification mentioned in the
introduction from this standpoint. Suppose that Model I applies to
the $\{\alpha_h\}$ and $\{\beta_i\}$, i.e. that $\pi_\alpha$ and $\pi_\beta$ are of sizes a and b respectively,
and it is desired to test the hypothesis that $\alpha_1 = \alpha_2 = \cdots = \alpha_a = 0$.
Now in the case that $\pi_{\alpha\beta}$ is also assumed finite of size ab the expectation
of $M_A$ is

$$E(M_A) = \sigma^2_\epsilon + \frac{nb}{a - 1} \sum_h \alpha_h^2,$$

while in the case that $\pi_{\alpha\beta}$ is normal

$$E(M_A) = \sigma^2_\epsilon + n\sigma^2_{\alpha\beta} + \frac{nb}{a - 1} \sum_h \alpha_h^2.$$

The expectations of the interaction mean square in the two cases
respectively are

$$E(M_{AB}) = \sigma^2_\epsilon + \frac{n}{(a - 1)(b - 1)} \sum_{h,i} (\alpha\beta)_{hi}^2$$


and

$$E(M_{AB}) = \sigma^2_\epsilon + n\sigma^2_{\alpha\beta},$$

while the error mean square, $M_E$, has expectation

$$\sigma^2_\epsilon$$

in either case. It appears then that the test of the stated hypothesis
should be made in the first case by referring the ratio

$$M_A/M_E$$

to the F-table, while in the second case the ratio

$$M_A/M_{AB}$$

is the appropriate one. Which is the appropriate test ratio then depends 
upon whether the interaction effects are regarded as systematic and 
constant over repetitions of the experiment, or as independent normal 
variables over repetitions. The question of the two tests was pointed 
out by Fisher (15) very early. Johnson appears to be the first to have 
considered it explicitly in terms of variance components. A complete 
discussion of this and similar questions with a consideration of the 
practical circumstances under which each model is appropriate would 
be most welcome. 
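
The choice between the two test ratios can be sketched in a few lines of Python. This is only an illustration: the function name `choose_f_ratio`, its arguments and the mean squares are hypothetical and are not taken from Johnson's paper.

```python
def choose_f_ratio(ms_a, ms_ab, ms_e, interaction_random):
    """Return the F ratio for testing alpha_1 = ... = alpha_a = 0.

    interaction_random=False: interaction effects treated as fixed
        (finite population), so the error mean square is the denominator.
    interaction_random=True: interaction effects treated as independent
        normal variables, so the interaction mean square is the denominator.
    """
    denominator = ms_ab if interaction_random else ms_e
    return ms_a / denominator


# Hypothetical mean squares, chosen only for illustration.
ms_a, ms_ab, ms_e = 120.0, 30.0, 10.0
f_fixed = choose_f_ratio(ms_a, ms_ab, ms_e, interaction_random=False)
f_random = choose_f_ratio(ms_a, ms_ab, ms_e, interaction_random=True)
print(f_fixed, f_random)
```

With the same data the two models can lead to very different ratios, which is why the status of the interaction effects has to be settled before the test is made.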

It appears to the writer that Model II is frequently assumed, e.g. in
the balanced two-way classification, when it is quite inappropriate.
Neglecting even the question of normality, it seems that the assumption
of independence between the samples $\{\alpha_i\}$ and $\{\beta_j\}$ and the sample
$\{(\alpha\beta)_{ij}\}$ is frequently not justified. The terms $(\alpha\beta)_{ij}$ are introduced
into the equation to compensate for the lack of strict additivity in the
main effects. Although it may be very reasonable to regard the
A-classes and B-classes as random samples from meaningful populations,
more often than not the $(\alpha\beta)_{ij}$ will be at least partially determined by
the particular $\alpha$'s and $\beta$'s which have been drawn. This situation also
seems to deserve more attention than it has yet received. See Tukey
(32).

Henderson (20) has considered the problem of estimating jointly the 
variance components and the values of the classification effects in the 
particular sample at hand. Even in the simplest case of the balanced 
one-way classification the maximum likelihood equations under Model 
II for this joint estimation are formidable. The chief point to be noted 
here is that the maximum likelihood estimates of the variance com- 
ponents even in the balanced case are not those obtained from the 


simple analysis of variance when the joint estimation problem is con- 
sidered. 




Reference should also be made to work of Grubbs (17) and Smith 
(29) which falls within the province of variance component analysis. 
This work is concerned with the problem of estimating separately 
product variability and instrument precision when it is not possible to 
measure the same item twice with the same instrument, but is possible 
to measure the same item with two or more instruments. The author 
regrets that he has not had an opportunity to study this material 
sufficiently well to include more than mention of it here. 

Hope of further progress in the theory of variance component 
analysis seems to be along the following lines: 

1. Tukey’s (33) developments open a wide range of possibilities for 
very complex arrangements and for non-normal distributions. 

2. Wald’s (36) approach to the particular problem of confidence 
limits on the ratios of variance components is very general within the 
framework of Model II. It might well serve as a model for other work 
directed toward other problems within Model II. 


REFERENCES 


1. Baines, A. H. J. On the allocation of observations in certain types of factorial
experiment when the number of observations is limited. Ministry of Supply
Advisory Service on Statistical Method and Quality Control. Tech. Rep. No.
Q.C./R/4. 1943.

2. Baines, A. H. J. On the economical designs of statistical experiments. Ibid. Tech. Rep. No.
Q.C./R/15. 1944.

3. Bhattacharyya, A. A note on the distribution of chi-squares. Sankhya 7:27-28.
1945.

4. Bross, Irwin. Fiducial intervals for variance components. Biometrics 6:136-144.
1950.

5. Cochran, W. G. Problems arising in the analysis of a series of similar experiments.
J. Roy. Stat. Soc. Supp. 4:102-118. 1937.

6. Cochran, W. G. Analysis of covariance. A paper presented to the Institute of
Mathematical Statistics at Princeton University, November 1, 1946.

7. Cochran, W. G. The use of analysis of variance in enumeration by sampling. J. Am. Stat.
Assoc. 34:492-510. 1939.

8. Crump, S. L. The estimation of variance components in analysis of variance.
Biom. Bull. 2:7-11. 1946.

9. Crump, S. L. The estimation of components of variance in multiple classifications.
Unpublished Ph.D. thesis, Iowa State College Library, Ames, Iowa. 1947.

10. Daniels, H. E. The estimation of components of variance. J. Roy. Stat. Soc. Supp.
6:186-197. 1939.

11. Eisenhart, Churchill. The assumptions underlying analysis of variance.
Biometrics 3:1-21. 1947.

12. Federer, W. T. Evaluation of variance components from a group of experiments
with multiple classifications. Unpublished Ph.D. thesis, Iowa State College
Library, Ames, Iowa. 1948.

13. Finkner, A. L., Morgan, J. J., and Monroe, R. J. Methods of estimating farm
employment from sample data in North Carolina. N. C. State Agric. Expt. Sta.
Tech. Bull. No. 75. 1943.

14. Fisher, R. A. The fiducial argument in statistical inference. Ann. Eugen.
6:391-398. 1935.

15. Fisher, R. A. The Design of Experiments. London: Oliver & Boyd. 1935.

16. Ganguli, M. A note on nested sampling. Sankhya 5:449-452. 1941.

17. Grubbs, F. E. On estimating the precision of measuring instruments and product
variability. J. Am. Stat. Assoc. 43:243-264. 1948.

18. Hammersley, J. M. The unbiased estimate and standard error of the interclass
variance. Metron 15:189-205. 1949.

19. Henderson, C. R. Unpublished thesis, Iowa State College Library. 1948. (Title
not known).

20. Henderson, C. R. Estimation of genetic parameters. Abstract. Ann. Math. Stat.
21:308. 1950.

21. Hetzer, H. O., Dickerson, G. E., and Zeller, J. H. Heritability of type in Poland
China swine as evaluated by scoring. J. Animal Sci. 3:390-398. 1944.

22. Johnson, N. L. Alternative systems in the analysis of variance. Biometrika
35:80-87. 1948.

23. Lucas, H. L. A method of estimating components of variance in disproportionate
numbers. Abstract. Ann. Math. Stat. 21:302. 1950.

24. Nordskog, A. W. and Crump, S. L. Systematic and random sampling for
estimating egg production in poultry. Biometrics 4:223-233. 1948.

25. Patnaik, P. B. The non-central $\chi^2$- and F-distributions and their applications.
Biometrika 36:202-232. 1949.

26. Pearson, K. On the application of the double Bessel function $K_{\tau_1,\tau_2}(x)$ to
statistical problems. Biometrika 25:158-217. 1933.

27. Satterthwaite, F. E. Synthesis of variance. Psychometrika 6:309-316. 1941.

28. Satterthwaite, F. E. An approximate distribution of estimates of variance components.
Biometrics 2:110-114. 1946.

29. Smith, H. F. Estimating precision of measuring instruments. J. Am. Stat. Assoc.
45:447-451. 1950.

30. Snedecor, G. W. Statistical Methods, 284-301. Iowa State College Press. 1946.

31. Tukey, J. W. Dyadic anova, an analysis of variance for vectors. Human Biol.
21:65-110. 1949.

32. Tukey, J. W. Interaction in a row-by-column design. Memorandum report 18 of the
Statistical Research Group, Princeton Univ. 1949.

33. Tukey, J. W. Finite sampling simplified.¹ Memorandum report 45 of the Statistical
Research Group, Princeton Univ. 1950.

34. Wald, A. A note on analysis of variance with unequal class frequencies. Ann.
Math. Stat. 11:96-100. 1940.

35. Wald, A. On the analysis of variance in case of multiple classifications with unequal
class frequencies. Ibid. 12:346-349. 1941.

36. Wald, A. A note on regression analysis. Ibid. 18:586-588. 1947.

37. Wilm, H. G. Notes on analysis of experiments replicated in time. Biometrics
1:16-20. 1945.

38. Winsor, C. P. and Clarke, G. L. A statistical study of variation in the catch of
plankton nets. J. Marine Res. 3:1-34. 1940.

39. Yates, F. The analysis of multiple classifications with unequal numbers in the
different classes. J. Am. Stat. Assoc. 29:51-66. 1934.


¹Part of this report was published in the December issue of J. Am. Stat. Assoc. under the title,
“Some Sampling Simplified”.


TESTING A LINEAR RELATION AMONG VARIANCES* 
W. G. Cochran**


School of Hygiene and Public Health, 
Johns Hopkins University. 


1. NATURE OF THE PROBLEM 


We have a number of independent estimates $v_i$ of variances $\theta_i$
respectively ($i = 1, 2, \ldots, k$). The estimate $v_i$ is based on $n_i$
degrees of freedom (d.f.) and follows the usual distribution of a mean
square derived from normally distributed observations; namely that
$n_i v_i / \theta_i$ is $\chi^2$ with $n_i$ d.f. We wish to test the null hypothesis that p
homogeneous linear relations

(1) $$c_{1j}\theta_1 + c_{2j}\theta_2 + \cdots + c_{kj}\theta_k = 0 \qquad (j = 1, 2, \ldots, p)$$

hold among the $\theta_i$, where the $c_{ij}$ are known numbers.

This problem, usually involving a single linear relation, has been 
encountered occasionally in the applications of statistical methods to 
data. With the current increased interest in the “components of 
variance” technique, it is not unlikely that the problem will appear 
more frequently in the future. Two examples will be described briefly. 

An experiment is repeated at r different places on each of c different 
occasions, because it is expected that the performance of the
treatments (t) will differ from one place or one time to another. The places
and times are assumed to be a random sample of the population of 
places and times in which the results will be used. If the usual “com- 
ponents of variance” model is set up, the expectations of the four 
principal items in the analysis of variance of the results are as shown 
in Table I. 


*Prepared in connection with research sponsored by the Office of Naval Research. 

**Presented before the American Statistical Association annual meeting in Chicago, 1950 at 
sessions held jointly by the Biometrics Section of the American Statistical Association and The Bio- 
metric Society (ENAR). Department of Biostatistics paper no. 266. 




TABLE I. EXPECTATIONS OF MEAN SQUARES.

Source of variation                  m.s.    Expected value

Treatments                           $v_1$   $\theta_1 = (E) + (TPO) + r(TO) + c(TP) + rc\sum(\tau_i - \bar{\tau})^2/(t - 1)$
Treatments × places                  $v_2$   $\theta_2 = (E) + (TPO) + c(TP)$
Treatments × occasions               $v_3$   $\theta_3 = (E) + (TPO) + r(TO)$
Treatments × places × occasions      $v_4$   $\theta_4 = (E) + (TPO)$


In this representation, all effects except the treatment means $\tau_i$ are
assumed to be random variables. The variance (TPO) stands for the
variance contributed by the three-factor interaction, and so on. For
more detailed discussion of this model, see (1) and (2).

Suppose that we wish to test the hypothesis that there are no
differences in the average effects of the treatments; i.e. that all $\tau_i$ are equal.
It is evident that none of the other lines in the analysis of variance
supplies an appropriate denominator, or ‘error’ mean square, for an
F-test of the treatments mean square. In fact, the null hypothesis that
all $\tau_i$ are equal implies the linear relation

$$\theta_1 - \theta_2 - \theta_3 + \theta_4 = 0.$$

Consequently a test of the null hypothesis is a test of this linear relation.
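
The linear relation can be checked numerically from the expectations in Table I. The sketch below is illustrative only: the function names and the component values are hypothetical, and `treatment_term` stands for the last term of $\theta_1$, which vanishes when all $\tau_i$ are equal.

```python
def expected_mean_squares(E, TPO, TO, TP, r, c, treatment_term):
    """Expected mean squares of Table I for given variance components."""
    theta1 = E + TPO + r * TO + c * TP + treatment_term
    theta2 = E + TPO + c * TP
    theta3 = E + TPO + r * TO
    theta4 = E + TPO
    return theta1, theta2, theta3, theta4


def contrast(thetas):
    """theta1 - theta2 - theta3 + theta4."""
    t1, t2, t3, t4 = thetas
    return t1 - t2 - t3 + t4


# Hypothetical variance components, chosen only for illustration.
components = dict(E=2.0, TPO=0.5, TO=0.25, TP=0.75, r=4, c=3)
null_case = expected_mean_squares(treatment_term=0.0, **components)
alt_case = expected_mean_squares(treatment_term=6.0, **components)
print(contrast(null_case), contrast(alt_case))
```

Under the null hypothesis the contrast is zero whatever the values of the other components; under the alternative it equals the treatment term itself.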

A second example occurred in a corn-breeding experiment in which 
mn female parents from a population produced by random mating were 
mated n to each of m males from the same population. Comstock and 
Robinson (3) were able to show that under certain assumptions the 
expectations of three mean squares in the analysis of variance of the 
results were connected by the relation 


$$c_1\theta_1 + c_2\theta_2 = \theta_3.$$


In this case the coefficients $c_1$ and $c_2$ are functions of a quantity a
which serves as a measure of the degree of dominance, the value a = 0
implying no dominance, while values greater than one imply over- 
dominance. Thus the null hypothesis that a has a specified value leads 
to a linear relation with known coefficients, which we may wish to test 
from the data. 




2. RELATION TO PREVIOUS WORK. 


Three tests which are already well-known are particular cases of the
general problem of testing relations (1).

(i) The F-test. This tests the relation $\theta_1 = \theta_2$. It is an exact test.

(ii) Bartlett’s test of homogeneity of variances. This tests the series of
(k − 1) relations $\theta_1 = \theta_2 = \cdots = \theta_k$. For this Bartlett (4) proposed
a form of the likelihood-ratio test criterion

$$\left(\sum n_i\right) \log_e \bar{v} - \sum (n_i \log_e v_i),$$

where $\bar{v} = \sum n_i v_i / \sum n_i$ is the pooled estimate of variance. The criterion
is distributed approximately as $\chi^2$ with (k − 1) d.f. Modifications
have been suggested by Bartlett and others in order to improve
the approximation to the tabular $\chi^2$ distribution.
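
As an illustration of (ii), the unmodified criterion can be computed directly. This is a minimal sketch: the function name `bartlett_criterion` and the variances and degrees of freedom below are hypothetical, and none of the later correction factors is applied.

```python
import math


def bartlett_criterion(v, n):
    """(sum n_i) log(pooled v) - sum(n_i log v_i), without correction factor."""
    total_df = sum(n)
    pooled = sum(ni * vi for ni, vi in zip(n, v)) / total_df
    return total_df * math.log(pooled) - sum(
        ni * math.log(vi) for ni, vi in zip(n, v)
    )


# Hypothetical estimates: equal variances give zero, unequal ones a positive value.
m_equal = bartlett_criterion([2.0, 2.0, 2.0], [5, 10, 15])
m_unequal = bartlett_criterion([1.0, 2.0, 4.0], [10, 10, 10])
print(m_equal, m_unequal)
```

The second value would be referred to the $\chi^2$ table with k − 1 = 2 d.f.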

(iii) The two-tailed Behrens-Fisher problem. If $x_1$ is a normally-distributed
estimate of $\mu_1$ with variance $\theta_1$, with similar roles for $x_2$, $\mu_2$
and $\theta_2$, the Behrens-Fisher problem is concerned with testing the
hypothesis $\mu_1 = \mu_2$. We possess estimates $v_1$, $v_2$ of $\theta_1$, $\theta_2$ respectively,
but nothing is known about the relative sizes of $\theta_1$ and $\theta_2$.

The test criterion is $(x_1 - x_2)/\sqrt{v_1 + v_2}$. If, however, the test is
to be two-tailed, we might equally use $(x_1 - x_2)^2$ as the basis of a test
criterion. When the null hypothesis holds, $(x_1 - x_2)^2$ is an estimate of
$(\theta_1 + \theta_2)$, based on 1 d.f. Thus the problem may be regarded as one
of testing the relation

$$\theta_3 - \theta_1 - \theta_2 = 0$$

where the estimate $(x_1 - x_2)^2$ of $\theta_3$ happens to be derived from only
1 d.f.

This case illustrates the difficulty of testing a linear relation among
variances, even when only three variances are involved. The unknown
ratio $\theta_1/\theta_2$ tends to creep into the solution as a nuisance parameter,
and some way must be found to get rid of it. Two approaches may be
mentioned. Fisher (5) developed a test in which the ratio $v_1/v_2$ remains
fixed from sample to sample, while the ratio $\theta_1/\theta_2$ varies about $v_1/v_2$ in
its fiducial distribution. Tables have been constructed by Sukhatme
(6). On the other hand, Welch (7) considered a population in which
$\theta_1/\theta_2$ remains fixed. By successive approximation he obtained a
function h, depending on $v_1/v_2$ and on the significance probability P, such
that

$$\Pr\{(x_1 - x_2) > h(v_1/v_2, P)\} = P,$$

for any value of $\theta_1/\theta_2$. Aspin (8) has published tables from which the
test may be made.




3. CONTENT OF THIS PAPER. 


The subsequent discussion falls into three main parts. An approxi- 
mate F-test of a single relation will be presented and illustrated. Rela- 
tively little is known about the closeness of this approximation, but as 
the test is being used in applications, it seems advisable to put some 
account of it on record. The test is easy to make, and is recommended, 
at least until a more exact test may appear. In a later section an 
investigation of the adequacy of the approximation and of the power 
of the test is given for the case where there are only three variances. 
Finally, in an Appendix, the large-sample limiting distribution of the 
test is compared with those of other tests which suggest themselves, 
particularly the likelihood-ratio test. 


4. THE APPROXIMATE F-TEST. 


Since the coefficients $c_{ij}$ in the relation are known numbers, the
linear relation can always be reduced to the form

(2) $$\theta_1 + \theta_2 + \cdots + \theta_r = \theta_{r+1} + \cdots + \theta_k.$$

If the alternative hypothesis specifies that one side of the equation is
definitely greater than the other, we suppose that the left side is the
greater. The test criterion suggested is

$$F' = \frac{v_1 + v_2 + \cdots + v_r}{v_{r+1} + \cdots + v_k}.$$

When the null hypothesis holds, this quantity follows the F-distribution,
approximately. The degrees of freedom $\nu_1$, $\nu_2$ are found by a rule
suggested by Fairfield Smith (9) and Satterthwaite (10).

$$\nu_1 = \frac{(v_1 + \cdots + v_r)^2}{\dfrac{v_1^2}{n_1} + \cdots + \dfrac{v_r^2}{n_r}}; \qquad \nu_2 = \frac{(v_{r+1} + \cdots + v_k)^2}{\dfrac{v_{r+1}^2}{n_{r+1}} + \cdots + \dfrac{v_k^2}{n_k}}.$$

The values of $\nu_1$, $\nu_2$ are not in general integers. Interpolation in
the F-tables is rarely necessary, since a glance at the F-value for the
nearest integers to $\nu_1$ and $\nu_2$ usually decides the issue.

If the alternative hypothesis is two-sided, we place in the numerator 
of F’ whichever of the estimates of variance happens to be larger, so 
that F’ > 1. The resulting probability is doubled. 

Many different F-ratios could be formed for an approximate test of
relation (2). Consider the first example described previously, where the
null hypothesis is $(\theta_1 - \theta_2 - \theta_3 + \theta_4 = 0)$. Under the alternative
hypothesis, the values of $\theta_2$, $\theta_3$ and $\theta_4$ remain unchanged but that of
$\theta_1$ increases. This suggests that an F-ratio of the form $v_1/(v_2 + v_3 - v_4)$,
instead of $(v_1 + v_4)/(v_2 + v_3)$, might be appropriate, and some workers
have used this form. In the case where all d.f. $n_i$ become large, the
limiting power functions of the two test criteria are the same. The
form proposed was suggested intuitively on the grounds that a quantity
like $(v_2 + v_3 - v_4)$, where some coefficients are negative, is not so well
represented by a Type III approximation as a linear form where all
coefficients are positive. Consequently, the proposed F’ may be
distributed more like F, in samples of practical size, than the alternative
criterion.


5. NUMERICAL EXAMPLE. 
This is a test of the relation

$$\theta_1 + \theta_4 = \theta_2 + \theta_3.$$

The data come from a long-term experiment on sugar-beet conducted
by Rothamsted Experimental Station. Since the data are intended only
to illustrate the arithmetic involved in making the test, explanatory
details are omitted. A partial analysis of variance is shown in Table II.


TABLE II. PARTIAL ANALYSIS OF VARIANCE (ROOTS, TONS PER ACRE).

Source of variation    d.f.    m.s.      Estimate of

Treatments               3     32,489    $\theta_1 = (E_b) + 5(TY) + 4(E_a) + 20(T)$
Error (a)                8      2,791    $\theta_2 = (E_b) + 4(E_a)$
Treatments × years       9      1,527    $\theta_3 = (E_b) + 5(TY)$
Error (b)               24        317    $\theta_4 = (E_b)$


The null hypothesis is that the component (T) is zero. This is
equivalent to $\theta_1 + \theta_4 = \theta_2 + \theta_3$ with a one-sided alternative. Hence

$$F' = \frac{32{,}489 + 317}{2{,}791 + 1{,}527} = \frac{32{,}806}{4{,}318} = 7.60,$$

$$\nu_1 = \frac{(32{,}806)^2}{\dfrac{(32{,}489)^2}{3} + \dfrac{(317)^2}{24}} = \frac{1}{\dfrac{(.9903)^2}{3} + \dfrac{(.0097)^2}{24}} = 3.06,$$

$$\nu_2 = \frac{(4{,}318)^2}{\dfrac{(2{,}791)^2}{8} + \dfrac{(1{,}527)^2}{9}} = \frac{1}{\dfrac{(.6464)^2}{8} + \dfrac{(.3536)^2}{9}} = 15.1.$$

The 1% value of F for 3, 15 d.f. is 5.42. The observed F’ is definitely
significant.
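
The arithmetic of this example can be reproduced with a short Python sketch. The function names `satterthwaite_df` and `approximate_f` are hypothetical labels for the rule of Section 4; the mean squares and d.f. are those of Table II.

```python
def satterthwaite_df(mean_squares, dfs):
    """Approximate d.f. of a sum of mean squares: (sum v)^2 / sum(v^2 / n)."""
    total = sum(mean_squares)
    return total ** 2 / sum(v ** 2 / n for v, n in zip(mean_squares, dfs))


def approximate_f(numerator, denominator):
    """numerator, denominator: lists of (mean square, d.f.) pairs."""
    num_v, num_n = zip(*numerator)
    den_v, den_n = zip(*denominator)
    f_prime = sum(num_v) / sum(den_v)
    nu1 = satterthwaite_df(num_v, num_n)
    nu2 = satterthwaite_df(den_v, den_n)
    return f_prime, nu1, nu2


# Table II: treatments and error (b) against error (a) and treatments x years.
f_prime, nu1, nu2 = approximate_f(
    numerator=[(32489, 3), (317, 24)],
    denominator=[(2791, 8), (1527, 9)],
)
print(round(f_prime, 2), round(nu1, 2), round(nu2, 2))
```

Because $v_1$ dominates the numerator, $\nu_1$ is barely above the 3 d.f. of the treatments mean square, while $\nu_2$ falls between the 8 and 9 d.f. of its components and their total of 17.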


6. INVESTIGATION OF THE F’ DISTRIBUTION. 


This investigation is confined to the case of three variances, where
the null hypothesis $\theta_3 = \theta_1 + \theta_2$ holds. The test criterion is $F' = v_3/(v_1 + v_2)$.
The quantity computed was $P(F' > F'_{.05})$; that is, the
true probability that F’ exceeds the 5 percent level attributed to it by
the approximation used.

The approximation is the tabular F-distribution, with $\nu_1$, $\nu_2$ d.f.,
where $\nu_1 = n_3$ and

(3) $$\nu_2 = \frac{(v_1 + v_2)^2}{\dfrac{v_1^2}{n_1} + \dfrac{v_2^2}{n_2}} = \frac{(1 + u)^2}{\dfrac{u^2}{n_1} + \dfrac{1}{n_2}},$$

with $u = v_1/v_2$. Note that this distribution depends on the variance-ratio
u, so that we do not use a single approximation to the exact
distribution of F’, but a whole series of approximations, depending on the
u which happens to turn up. Consequently,

(4) $$P(F' > F'_{.05}) = \int_0^\infty P(F' > F'_{.05} \mid u)\, f(u)\, du,$$

where f(u) is the frequency function of u. The distributions of F’ and
u are not independent, except in certain special cases, so that in (4)
we must use the conditional distribution of F’ given u. This may be
obtained by routine methods.


An alternative approach gives the conditional distribution more
quickly. It is well known that

$$Q = \frac{v_3 (n_1 + n_2)}{\theta_3 \left( \dfrac{n_1 v_1}{\theta_1} + \dfrac{n_2 v_2}{\theta_2} \right)}$$

follows the F-distribution with $(n_3, n_1 + n_2)$ d.f. and is distributed
independently of $u = v_1/v_2$. Hence the conditional distribution of Q
given u is the same F-distribution. But, since $\theta_3 = \theta_1 + \theta_2$,

(5) $$F' = \frac{v_3}{v_1 + v_2} = Q\, \frac{(1 + U)(n_1 u + n_2 U)}{U (1 + u)(n_1 + n_2)},$$

where $U = \theta_1/\theta_2$. Thus F’ is the product of Q and a factor which
depends solely on u and the fixed parameter U. Hence the conditional
distribution of F’ is transformed to an ordinary F-distribution, with
$(n_3, n_1 + n_2)$ d.f., by multiplying by the inverse of the factor.
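
The factorization of F’ into Q and a function of u and U is an algebraic identity, and it can be checked numerically. The sketch below draws arbitrary positive values (not simulated mean squares) and compares both sides; the function names are hypothetical and the form of the factor is the one implied by the definitions of Q, u and U with $\theta_3 = \theta_1 + \theta_2$.

```python
import random


def f_prime(v1, v2, v3):
    return v3 / (v1 + v2)


def q_statistic(v1, v2, v3, n1, n2, theta1, theta2):
    theta3 = theta1 + theta2  # null hypothesis
    return v3 * (n1 + n2) / (theta3 * (n1 * v1 / theta1 + n2 * v2 / theta2))


def factor(u, U, n1, n2):
    """Multiplier linking Q to F' when theta3 = theta1 + theta2."""
    return (1 + U) * (n1 * u + n2 * U) / (U * (1 + u) * (n1 + n2))


rng = random.Random(0)
max_gap = 0.0
for _ in range(1000):
    v1, v2, v3 = (rng.uniform(0.1, 10.0) for _ in range(3))
    theta1, theta2 = rng.uniform(0.1, 10.0), rng.uniform(0.1, 10.0)
    n1, n2 = rng.randint(2, 30), rng.randint(2, 30)
    lhs = f_prime(v1, v2, v3)
    rhs = q_statistic(v1, v2, v3, n1, n2, theta1, theta2) * factor(
        v1 / v2, theta1 / theta2, n1, n2
    )
    max_gap = max(max_gap, abs(lhs - rhs) / lhs)
print(max_gap)
```

The largest relative discrepancy is at the level of floating-point rounding, as expected for an exact identity.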

These results give a method for numerical evaluation of (4). The
percentiles of the distribution of u can be read from the F-table, since
u/U follows the F-distribution with $n_1$, $n_2$ d.f. For any value of u,
$F'_{.05}$ is found by means of equation (3) and a further reference to the
F-table with $\nu_1$, $\nu_2$ d.f. Then the conditional probability that F’ exceeds
$F'_{.05}$ is read by transforming F’ to an F-distribution with $(n_3, n_1 + n_2)$
d.f. from (5). Finally, the integral is evaluated numerically in the form

(6) $$P(F' > F'_{.05}) = \int_0^1 P(F' > F'_{.05} \mid p)\, dp,$$

where p is the cumulative probability obtained from u so that
$dp = f(u)\, du$.

Computations were made for six sets of values of the $n_i$, ranging
from $n_1 = n_2 = 6$, $n_3 = 3$; to $n_1 = n_2 = 24$, $n_3 = 6$. These values
were thought to be fairly representative of the numbers of d.f. available
in some of the smaller applications. Four ratios for $\theta_1/\theta_2$ were included;
1, 2, 4 and 16. The true significance probabilities are shown in Table III.


TABLE III. TRUE SIGNIFICANCE PROBABILITY OF
F’ AT THE APPARENT 5% LEVEL.

                            U = θ1/θ2
n1 = n2    n3       1       2       4       16

   6        3     .044    .047    .051    .056
   6        6     .043    .046    .053    .059
  12        6     .047    .049    .052    .054
  12       12     .046    .049    .054    .056
  24        6     .050    .050    .052    .053
  24       12     .049    .050    .052    .054


The primary purpose of the F’ test, with its varying significance
levels according to the value of $v_1/v_2$, is to nullify the effect of the
nuisance parameter $U = \theta_1/\theta_2$. If the approximation were fully
successful, all entries in Table III would be 0.050. Looking along the
rows, we see that the effect of variations in U is not obliterated. When
U = 1, the F’ test gives in general slightly too few significant results.
As U departs from 1, the proportion of significant results increases to
over 1 in 20. Throughout the range of variation of U, however, the
probability is always near enough to 0.05 so that the approximation
seems adequate for practical use.

Only a few values were calculated for U > 16, since this case may
be rare in practice. The significance probabilities appear to increase
to a maximum which is not far from the probabilities for U = 16. As
U increases still further, the probabilities decline towards 0.05, which
is the limiting value when U = ∞.

The effect of $n_3$ (number of d.f. in the numerator of the test) is seen
by comparing neighboring pairs of rows. Rather surprisingly, the
approximation is slightly better with the lower than with the higher of
the two values of $n_3$. As $n_1$ and $n_2$ increase, on the other hand, the
approximation tends to improve, as would be expected.

This case provides little guidance as to the performance of the test
when more than three variances are involved. The methods employed
can be extended to the case of four variances, but the calculations are
much more lengthy. We would expect the approximation to be less
satisfactory with four variances, since two nuisance parameters are
involved.


7. THE POWER FUNCTION OF THE F’ TEST.


If $\phi = \theta_3/(\theta_1 + \theta_2)$, equation (5) still holds for the conditional
distribution of F’ given u, except that the right side is multiplied by $\phi$.
For any known value of $\phi$, the method of the previous section can be
used to find the probability that the test gives a significant result at
the 5 percent level.

Calculations were made for the case $n_i = 6$, and $\phi$ = 2, 4, 8. The
probabilities are shown in Table IV. The corresponding probabilities
for the ordinary F-test with 6 and 12 d.f. and with 6 and 6 d.f. are
included for comparison. It should be noted that in the case of F’,
computations were made for the apparent 5 percent levels. As shown in
Table III, these levels were not actually at 5 percent, being slightly
under it for U = 1, 2 and slightly over for U = 4, 16. The
computations could have been adjusted to a common 5 percent level, but it
seemed preferable to examine the power of the test as it actually
operates.


TABLE IV. PROBABILITY OF OBTAINING A SIGNIFICANT RESULT
AT THE APPARENT 5% LEVEL OF F’.

                            φ = θ3/(θ1 + θ2)
Test                  2               4               8

F(6, 12)            .259            .622            .881
F’: U = 1                (.26)*     .592 (.62)      .866 (.88)
F’: U = 2                (.24)      .583 (.59)      .857 (.86)
F’: U = 4                (.23)      .556 (.57)
F’: U = 16          .213 (.22)      .504 (.53)      .792 (.81)
F(6, 6)             .188            .468            .767

*Figures in ( ) denote an approximation discussed later.

In all cases the power lies between that of F(6, 12) and that of
F(6, 6). This suggests the general result that the power lies between
that of $F(n_3, n_1 + n_2)$ and that of $F(n_3, n_s)$, where $n_s$ is the smaller
of $n_1$, $n_2$. I have not, however, been able to establish this result.
The power tends to decline steadily as $U = \theta_1/\theta_2$ departs from unity.
This again would be expected intuitively.

A rough approximation to the power function can be obtained by
a method due to Welch (11). By this approximation, the F’ test has
a power equal to that of an F test with $n_3$ and $n_e$ d.f., where

$$n_e = \frac{(\theta_1 + \theta_2)^2}{\dfrac{\theta_1^2}{n_1} + \dfrac{\theta_2^2}{n_2}} = \frac{(U + 1)^2}{\dfrac{U^2}{n_1} + \dfrac{1}{n_2}}.$$

The probabilities given by this method are shown to 2 d.p. in parentheses
in Table IV. The approximation overestimates the power, but not too
seriously. Since $n_e$ always lies between $(n_1 + n_2)$ and the smaller of
$n_1$, $n_2$, the approximation is in line with the general speculation made
above.
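
The equivalent denominator d.f. used in this approximation is easy to tabulate. The following sketch (the name `equivalent_df` is hypothetical) evaluates $n_e$ for $n_1 = n_2 = 6$ and the four values of U used in the tables.

```python
def equivalent_df(U, n1, n2):
    """n_e = (U + 1)^2 / (U^2 / n1 + 1 / n2), with U = theta1 / theta2."""
    return (U + 1) ** 2 / (U ** 2 / n1 + 1 / n2)


n1 = n2 = 6
table = {U: equivalent_df(U, n1, n2) for U in (1, 2, 4, 16)}
for U, n_e in table.items():
    print(U, round(n_e, 2))
```

For U = 1 the value is exactly $n_1 + n_2 = 12$, which is why the approximate powers in the first F’ row of Table IV coincide with those of F(6, 12); as U grows, $n_e$ falls towards the smaller of $n_1$, $n_2$.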


8. SUMMARY COMMENTS. 


So far as they go, these investigations indicate, in my opinion, that 
the F’ test is quite satisfactory when three variances are involved, 
though we are still in the dark for the case of four or more variances. 
If a more precise test turns out to be necessary, one possibility is to try 
to extend the approach used by Welch (7). The F’ test might in fact 
be regarded as a first approximation to the type of test developed by 
Welch, which will probably involve the computation of tables of sig- 
nificance levels. Another possibility, suggested by J. W. Tukey, is to 
investigate different rules for calculating the numbers of d.f. to be 
assigned to F’, in the hope of finding one under which the significance
probability is less affected by the $\theta_i/\theta_j$ ratios. For example, the rule
might be made to depend on the significance probability, as it does in 
Welch’s method, whereas our rule does not take any account of this. 


APPENDIX: THE LIMITING DISTRIBUTION OF THREE TEST CRITERIA 


In addition to the F-test, two other tests have been considered.
The first is based on the assumption that

(7) $$(v_1 + \cdots + v_r) - (v_{r+1} + \cdots + v_k)$$

can be regarded as a normal deviate. This test might be tried if the
numbers of d.f. $n_i$ are all large. The second test is the likelihood ratio
test. Both tests have been used in practice. They do not appear to
me as satisfactory for small-sample work as the F’ test, but it seems
worthwhile to describe them.

As will be shown, all three tests are asymptotically equivalent. That 
is, if all n; tend to infinity in such a way that their ratios remain con- 
stant, the power functions of the three tests tend towards the same 
limiting distribution. The tests will be illustrated by applying them to 
the example given by Comstock and Robinson (3). 


9. THE NORMAL DEVIATE TEST. 


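Henceforth, the quantity (7) above will be denoted by

$$\sum\nolimits_1 v_i - \sum\nolimits_2 v_i.$$

If we divide this quantity by an estimate of its standard error, the test
criterion is

$$d = \frac{\sum_1 v_i - \sum_2 v_i}{\sqrt{2 \sum v_i^2 / n_i}}.$$

This is regarded as a normal deviate with mean zero and unit standard
deviation.

The data given by Comstock and Robinson (3) are as follows.

$$v_1 = 0.0334; \quad v_2 = 0.0248; \quad v_3 = 0.069$$

$$n_1 = 36; \quad n_2 = 180; \quad n_3 = 144.$$

The null hypothesis which corresponds to the absence of dominance is
$\theta_3 = \theta_1 + \theta_2$. The alternative hypothesis (presence of some
dominance) specifies that $\theta_3$ exceeds $(\theta_1 + \theta_2)$.

The value of d is

$$d = \frac{.069 - .0334 - .0248}{\sqrt{2 \left[ \dfrac{(.069)^2}{144} + \dfrac{(.0334)^2}{36} + \dfrac{(.0248)^2}{180} \right]}} = 0.930.$$

For a single-tailed test, this gives a probability of 0.176.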
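
The normal deviate test is simple enough to sketch with the standard library alone. The function name `normal_deviate_test` is hypothetical; the upper-tail normal probability is obtained from `math.erfc`, and the data are those quoted above from Comstock and Robinson.

```python
import math


def normal_deviate_test(sum1, sum2):
    """sum1, sum2: lists of (mean square, d.f.) on the two sides of the relation.

    Returns d and the one-tailed probability for the alternative that
    the first side exceeds the second.
    """
    diff = sum(v for v, _ in sum1) - sum(v for v, _ in sum2)
    std_error = math.sqrt(2 * sum(v ** 2 / n for v, n in sum1 + sum2))
    d = diff / std_error
    p_one_tailed = 0.5 * math.erfc(d / math.sqrt(2))
    return d, p_one_tailed


d, p = normal_deviate_test(
    sum1=[(0.069, 144)],
    sum2=[(0.0334, 36), (0.0248, 180)],
)
print(round(d, 3), round(p, 3))
```

The computed values agree with the 0.930 and 0.176 given in the text to the precision shown there.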




The limiting distribution of d is easily found by familiar methods.
Write $n_i = n a_i$, where the $a_i$ remain fixed as n increases. Then

$$d = \frac{\sqrt{n} \left( \sum_1 v_i - \sum_2 v_i \right)}{\sqrt{2 \sum v_i^2 / a_i}}.$$

Also let

$$\sum\nolimits_1 \theta_i - \sum\nolimits_2 \theta_i = \frac{\mu}{\sqrt{n}},$$

where $\mu$ remains finite, and measures the amount of divergence from the
null hypothesis. The factor $\sqrt{n}$ is needed in the denominator, because
in large samples the test can detect a divergence of order $1/\sqrt{n}$.

The numerator of d is asymptotically normally distributed, with
mean $\mu$ and variance $2 \sum \theta_i^2 / a_i$. Further, since $v_i$ converges in
probability to $\theta_i$, it follows, as shown by Cramér (12), that the denominator
of d converges in probability to

$$\sqrt{2 \sum \theta_i^2 / a_i}.$$

Hence the limiting distribution of d is normal, with mean

$$\frac{\mu}{\sqrt{2 \sum \theta_i^2 / a_i}}$$

and unit variance (Cramér, loc. cit.).


10. THE F’-TEST. 


For this example

$$F' = \frac{v_3}{v_1 + v_2} = \frac{.069}{.0334 + .0248} = 1.186,$$

with d.f. 144 and 98, where

$$98 = \frac{(.0334 + .0248)^2}{\dfrac{(.0334)^2}{36} + \dfrac{(.0248)^2}{180}}.$$

The significance probability is found to be 0.179, in close agreement
with the previous test.

For the asymptotic distribution of F’, we note that

$$\sqrt{n}\, (F' - 1) = \frac{\sqrt{n} \left( \sum_1 v_i - \sum_2 v_i \right)}{\sum_2 v_i} = d\, \frac{\sqrt{2 \sum v_i^2 / a_i}}{\sum_2 v_i}.$$

Since the factor by which d is multiplied on the right tends in
probability to a constant (the corresponding function of the $\theta_i$), the limiting
power functions of F’ and d are essentially the same.
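
The relation between F’ and d can be verified on the example data. In the sketch below the scale n = 36 is an arbitrary choice (any positive n gives the same agreement, since only the ratios $a_i = n_i/n$ enter), and the variable names are hypothetical.

```python
import math

v = {1: 0.0334, 2: 0.0248, 3: 0.069}
n_i = {1: 36, 2: 180, 3: 144}
n = 36  # arbitrary scale; a_i = n_i / n
a = {i: n_i[i] / n for i in n_i}

sum1 = v[3]           # side of the relation expected to be larger
sum2 = v[1] + v[2]

f_prime = sum1 / sum2
d = (sum1 - sum2) / math.sqrt(2 * sum(v[i] ** 2 / n_i[i] for i in v))
multiplier = math.sqrt(2 * sum(v[i] ** 2 / a[i] for i in v)) / sum2

lhs = math.sqrt(n) * (f_prime - 1)
rhs = d * multiplier
print(round(f_prime, 3), lhs, rhs)
```

Both sides agree to rounding error, so F’ − 1 is simply d rescaled by a factor that settles down to a constant in large samples.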


11. THE LIKELIHOOD RATIO TEST. 


For the present, we consider that the alternative hypothesis places
no restrictions on the $\theta_i$, except that they are positive: i.e. that the
test is two-sided. Apart from constant factors, the $\log_e$ of the likelihood
is

$$L = -(1/2) \sum n_i v_i / \theta_i - (1/2) \sum n_i \log \theta_i.$$

Under the alternative hypothesis, denoted by the suffix 2, the
maximum likelihood estimates are simply $\hat{\theta}_{i2} = v_i$. This gives

$$L_2 = -(1/2) \sum n_i - (1/2) \sum n_i \log v_i.$$

For the null hypothesis, we maximize L subject to the restriction

(8) $$\sum\nolimits_1 \hat{\theta}_i - \sum\nolimits_2 \hat{\theta}_i = 0.$$

This leads to the set of equations

(9) $$\frac{n_i v_i}{\hat{\theta}_i} - n_i = \pm g\, \hat{\theta}_i,$$

where g is a Lagrange multiplier and the sign on the right is + or −
depending on whether i belongs to $\sum_1$ or $\sum_2$.
Adding the equations, we get

(10) $$\sum \frac{n_i v_i}{\hat{\theta}_i} - \sum n_i = 0.$$

Hence, the test criterion,

(11) $$2(L_2 - L_1) = -\sum n_i - \sum n_i \log v_i + \sum \frac{n_i v_i}{\hat{\theta}_i} + \sum n_i \log \hat{\theta}_i = \sum n_i \log \frac{\hat{\theta}_i}{v_i},$$

using (10). It may be noted that the test criterion takes the same
form as in Bartlett’s test of homogeneity of variances.


Equations (9) for the maximum likelihood estimates under the null
hypothesis are non-linear, and I do not know any quick means of
solving them. One method is to note that

(12) $$g = \frac{\sum_1 v_i - \sum_2 v_i}{\sum \hat{\theta}_i^2 / n_i}.$$

By substituting first approximations to the $\hat{\theta}_i$, we find an
approximation to g. From this, a second approximation to each $\hat{\theta}_i$ is found from
the equation (9) in which it appears. Convergence is rather slow.

For the numerical example, this process gives

$$\hat{\theta}_1 = .039558, \quad \hat{\theta}_2 = .025304, \quad \hat{\theta}_3 = .064862.$$

$$2(L_2 - L_1) = 36 \log\left(\frac{.039558}{.0334}\right) + 180 \log\left(\frac{.025304}{.0248}\right) + 144 \log\left(\frac{.064862}{.069}\right) = .847$$

When the null hypothesis holds, the limiting distribution is that of
$\chi^2$ with 1 d.f.
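
One way to organize this iteration is sketched below. It is an assumption about the details, not necessarily the scheme used for the published figures: at each pass g is recomputed from the current estimates, and each estimate is then updated from its own equation written as $\hat{\theta}_i = v_i \mp g \hat{\theta}_i^2 / n_i$, with the current value substituted on the right. The function name `null_mle` is hypothetical.

```python
def null_mle(sum1, sum2, iterations=200):
    """Iterate for the estimates under the relation sum1(theta) = sum2(theta).

    sum1, sum2: lists of (mean square, d.f.) on each side of the relation.
    Returns the estimates (sum1 terms first, then sum2 terms) and g.
    """
    terms = [(v, n, +1) for v, n in sum1] + [(v, n, -1) for v, n in sum2]
    theta = [v for v, _, _ in terms]  # first approximations: the v_i
    diff = sum(sign * v for v, _, sign in terms)
    g = 0.0
    for _ in range(iterations):
        # Multiplier from the current estimates.
        g = diff / sum(t ** 2 / n for t, (_, n, _) in zip(theta, terms))
        # Update each estimate from its own equation.
        theta = [
            v - sign * g * t ** 2 / n for t, (v, n, sign) in zip(theta, terms)
        ]
    return theta, g


theta, g = null_mle(
    sum1=[(0.069, 144)],
    sum2=[(0.0334, 36), (0.0248, 180)],
)
theta3, theta1, theta2 = theta
print(round(theta1, 6), round(theta2, 6), round(theta3, 6))
```

Every pass of this scheme satisfies the restriction exactly, because the adjustments sum to the observed difference $\sum_1 v_i - \sum_2 v_i$; only the individual estimates change from pass to pass.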

As given here, the likelihood ratio test is two-sided. For comparison
with the other tests, we want to make a one-sided test against the
alternative $\theta_3 > (\theta_1 + \theta_2)$. To obtain a one-sided test in the general
case, compute $(\sum_1 v_i - \sum_2 v_i)$. If this has a sign opposite to that
specified in the alternative hypothesis, stop and declare the result
non-significant. If the sign is the appropriate one, calculate $\chi^2$, and halve
the resulting probability. Since $\chi^2$ has 1 d.f., we obtain the probability
by regarding $\sqrt{0.847} = 0.920$ as a normal deviate. This is practically
the same as the normal deviate, 0.930, obtained from the d test. It
gives a one-tailed probability of 0.179.
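
The one-sided procedure can be written out as a short sketch. The function name and arguments are hypothetical; the tail area of $\chi^2$ with 1 d.f. is obtained from the normal distribution through `math.erfc`, and the one-tailed probability is taken as half of the two-sided $\chi^2$ probability, which reproduces the 0.179 quoted above.

```python
import math


def one_sided_lr_probability(chi_square, observed_difference, alternative_sign=+1):
    """One-tailed probability for a likelihood ratio criterion with 1 d.f.

    observed_difference: sum_1 v_i - sum_2 v_i.
    alternative_sign: +1 if the alternative says sum_1 exceeds sum_2, else -1.
    Returns None when the difference has the wrong sign (non-significant).
    """
    if observed_difference * alternative_sign <= 0:
        return None
    # P(chi-square with 1 d.f. > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    two_sided = math.erfc(math.sqrt(chi_square / 2))
    return two_sided / 2


p = one_sided_lr_probability(0.847, observed_difference=0.069 - 0.0334 - 0.0248)
wrong_direction = one_sided_lr_probability(0.847, observed_difference=-0.01)
print(round(p, 3), wrong_direction)
```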


12. ASYMPTOTIC DISTRIBUTION OF THE LIKELIHOOD RATIO TEST.


Investigation of this distribution presents some annoying complica- 
tions relating to the existence and uniqueness of the maximum likelihood 
estimates under the null hypothesis. From the previous section, the 
equations to be solved may be written 


(13) $$v_i - \hat{\theta}_i = \pm g\, \frac{\hat{\theta}_i^2}{n_i}, \qquad (i = 1, 2, \ldots, k)$$

subject to

$$\sum\nolimits_1 \hat{\theta}_i - \sum\nolimits_2 \hat{\theta}_i = 0.$$



The equations are non-linear, and in finite samples there may be 
more than one set of solutions, though it is easy to select the “sensible” 
solution. However it is difficult to express the solution in a workable 
form. To avoid the complications, we shall use a unique and explicit set 
of second approximations to the maximum likelihood estimates, rather 
than the final estimates themselves. 

These second approximations are obtained by the iterative method suggested in the previous section. We take the $v_i$ as first approximations. Then from equation (12), an approximate value of the Lagrange multiplier $g$ is


Hence, as second approximations $u_i$ to the $\hat\theta_i$ we take


2 


UF 


where in the adjustment term we have substituted $v_i$ in place of the unknown $\hat\theta_i$.


As before, the quantities $n_i v_i/\theta_i$ are independently distributed as $\chi^2$ with $n_i$ d.f. respectively. We have $n_i = n a_i$, where the $a_i$ remain fixed as $n$ goes to infinity. Also


Certain preliminary results which are needed follow at once from standard theorems.
(i) The limiting distribution of $g/\sqrt{n}$ is normal with


Mean = or : Variance = 2, 


a; 
(ii) The quantity g/n tends in probability to zero. 


(iii) Since 
2 2 
U; = 0; ga = 0; BE 
nN; n a; 


it follows from (ii) that $u_i$ tends in probability to $\theta_i$.
In starting the proof, we cannot use the compact expression $\sum n_i \log(\hat\theta_i / v_i)$ for the test criterion, since this is valid only if the correct maximum likelihood estimates have been used. This point was brought forcibly to my notice when I obtained a $\chi^2$ of minus 2.2 when substituting rather crude estimates into this expression. Instead, we must use the more general results derived from equation (11).


— Xin, log + us) 


log | 1 | >| + | 


from (iii). Expanding the log, we obtain 


3,6 


NU; 


where 0 < a < 1. But 


n JLnJLaiui 
From (i), (ii) and (iii), this tends to zero in probability, so that the sum on the extreme right of (16) also tends to zero in probability.
Finally, the remaining term, 


From (i), this is the non-central $\chi^2$ with 1 d.f. and parameter $\rho^2 / \sum (2\theta_i^2 / a_i)$. Thus the limiting distribution is essentially the same as that for $d$ and $F'$.

In conclusion, the power functions of the three tests were first obtained by G. S. Watson; earlier, preliminary work on the likelihood ratio test had been done by Dr. R. A. Porter. I should like to thank Miss Elizabeth Grant and Miss Janice Harris, who carried out the bulk of the computations.


has a limiting distribution which is the same as that of Sa 
lg % 
4g 
2n a; 


REFERENCES 


(1) Cochran, W. G. and Cox, G. M. Experimental Designs. John Wiley and Sons, 
p. 411, 1950. 

(2) Anderson, R. L. Use of variance components in the analysis of hog prices in two 
markets. Jour. Amer. Stat. Ass. 42, 627, 1947. 

(3) Comstock, R. E. and Robinson, H. F. The components of genetic variance in 
populations of biparental progenies and their use in estimating the average 
degree of dominance. Biometrics, 4, 254-266, 1948. 

(4) Bartlett, M. S. Properties of sufficiency and statistical tests. Proc. Roy. Soc. Lond. A, 160, 268-282, 1937.

(5) Fisher, R. A. The fiducial argument in statistical inference. Ann. Eugen. 6, 
391-398, 1935. 

(6) Fisher, R. A. and Yates, F. Statistical Tables. Oliver and Boyd, Edinburgh, 
Table V1, 3rd. ed., 1949. 

(7) Welch, B. L. The generalization of Student’s problem when several different 
population variances are involved. Biometrika, 34, 28-35, 1947. 

(8) Aspin, A. A. Tables for use in comparisons whose accuracy involves two variances, separately estimated. Biometrika, 36, 290-296, 1949.

(9) Smith, H. F. The problem of comparing the results of two experiments with unequal errors. Jour. C.S.I.R. (Australia) 9, 211-212, 1936.

(10) Satterthwaite, F. E. An approximate distribution of estimates of variance 
components. Biometrics, 2, 110-114, 1946. 

(11) Welch, B. L. The significance of the difference between two means when the 
population variances are unequal. Biometrika, 29, 350-362, 1938. 

(12) Cramér, H. Mathematical methods of statistics. Princeton Univ. Press, 254-5, 
1946. 




COMPONENTS IN REGRESSION* 
John W. Tukey**
Princeton University 


1. Introduction. 


We shall be principally concerned with simple linear regression
where both variates are subject to “error”, although the same 
problems and methods apply to problems involving three, four, or many 
variates. By saying that we deal with “both variates subject to error” 
we mean that two measured quantities, which we may as well call x 
and y, are thought of as made up of two independent (or at least 
orthogonal) parts as follows 


observed quantity = steady part + fluctuation, 


where what is included in the steady part and what is fluctuation depends on the problem and our purpose. We seek to answer such questions as these:


(1) What is the relation of the steady parts as expressed by slopes? 

(2) Might the steady parts be exactly linearly related, and if so, 
with what possible slopes? 

(3) What limits can we place in general on the slopes relating the 
steady parts? 

(4) How variable are the steady parts? 

(5) If a measurement of one variate is used to replace the other,
what sort of steady errors are made?

(6) How variable are the fluctuations? 


and shall give numerical examples where each can be answered. 
After this, we discuss the case of more than two variates briefly. 


*Prepared in connection with research sponsored by the Office of Naval Research.

**Presented before the American Statistical Association Annual Meeting in Chicago, 1950 at 
sessions held jointly by the Biometrics Section of the American Statistical Association and the Biometric 
Society (ENAR). 




Instrumental variates 


No one of the questions just listed can be answered on the basis of 
a bare table of paired values of x and y. Something more must be 
added! 

The questions are strongly interrelated. If we know


the table of paired values 
that the steady parts are restricted to a line 
the slope of this line 


then we can estimate the variances and covariance of the fluctuations 
quite easily. Conversely, if we know 


the table of paired values 
the ratios between variances and covariances of the fluctuations, 
that the steady parts are restricted to a line 


then we can estimate the slope of this line quite easily. The difficulty 
is to get started! 

All the methods which have been proposed as ways of getting started 
either 


(1) require detailed assumptions about the distribution of the 
fluctuations 


which we shall rule out on practical grounds (except perhaps in special 
cases) or 


(2) require the use of a third variate with special properties. 


This instrumental variate may have numerical values, or it may merely be a classification. In either case, it must be appropriately related to x and y. Just what is required depends on the nature of the steady parts of x and y. If


(steady part of y) = α + β (steady part of x)


so that each steady part determines the other linearly, then the instrumental variate need only meet two conditions, namely


its covariance with the fluctuations in both x and y shall vanish 
its covariance with the steady parts of both x and y shall not 
vanish. 


In the terms we shall use later, it must be orthogonal to the fluctuations, but not to the steady parts. (We are deliberately avoiding the words “uncorrelated” and “correlated” since the use of correlation coefficients has helped greatly to confuse this subject.) When we require the “covariance” of a classification and a fluctuation to vanish we mean merely that the averages of the populations of fluctuations appropriate for each cell of the classification are the same. (For a quantitative instrumental variate we require only that the covariance vanish in the ordinary sense.)
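
Why the two conditions are enough to "get started" can be seen in a few lines. The notation here is assumed for illustration and is not the paper's: $X$ and $Y$ stand for the steady parts, $e$ and $f$ for the fluctuations, $t$ for a quantitative instrumental variate, and $\alpha$, $\beta$ for the intercept and slope of the line relating the steady parts.

```latex
\begin{aligned}
x &= X + e, \qquad y = Y + f, \qquad Y = \alpha + \beta X,\\
\operatorname{cov}(t, y) &= \beta\,\operatorname{cov}(t, X) + \operatorname{cov}(t, f)
                          = \beta\,\operatorname{cov}(t, X),\\
\operatorname{cov}(t, x) &= \operatorname{cov}(t, X) + \operatorname{cov}(t, e)
                          = \operatorname{cov}(t, X) \neq 0,\\
\beta &= \operatorname{cov}(t, y) \,/\, \operatorname{cov}(t, x).
\end{aligned}
```

The first condition removes the fluctuation terms, and the second keeps the denominator away from zero, so the slope is recoverable from covariances with $t$ alone.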

But we often have to deal with the case where the steady parts do 
not determine each other, although a value of either would give some 
information about the other. Let us put 


(steady part of x) = (common part of x) + (individual part of x),


(steady part of y) = (common part of y) + (individual part of y), 


where each individual part is orthogonal to the common part of its 
own variate and to both the common and individual parts of the other 
variate, and suppose 


(common part of y) = α + β (common part of x).


In this case we must require more of the instrumental variate, namely 


its covariance with the fluctuations in both x and y shall vanish,
its covariance with the individual parts of both x and y shall
vanish,
its covariance with the common parts of both x and y shall not
vanish.


The careful reader will note that the decomposition of the steady parts into common parts and individual parts would not be uniquely determined by a knowledge of the variances and covariance of the steady parts, just as the knowledge of variances and covariance of bivariate normally distributed quantities measured with error in both variates does not determine the decomposition of the measurements into errors and quantities. Just what is the common part is determined by our knowledge of the situation and our desire as to what we wish to study.
Two experimenters with different purposes could face arithmetically 
identical data and appropriately choose different definitions of common 
and individual parts. We shall try to indicate our choice in the ex- 
amples. 

With the aid of an instrumental variate we can separate steady parts 
and fluctuations enough to break into the problem, and then give some 
answers to all the questions. 

The first appearance of the idea of the instrumental variate was due to Wald [18] in 1940. From his verbal comments, the possibility of doing something here came as a real surprise. He dealt with the case of classification into two groups, and showed how the central questions could be answered. In 1949 Geary [6] put forward the idea of a quantitative instrumental variate, and introduced the word “instrumental”.

More recently a non-parametric procedure has been proposed by Hemelrijk [8] (careful examination shows that an instrumental variate will be needed to make any use of this or related methods), and Berkson
[2] has brought out and studied the important and neglected case where 
for one variate the fluctuation is independent of the observed value 
rather than of the steady value. We shall not discuss these aspects 
further. 

Both the instrumental and classical theories can be traced through the discussion and references in Wald [18], Geary [6], Reiersøl [13] who has recently reviewed the econometric side, and Lindley [9] whose paper is a compressed compendium.

It must be emphasized that we are not here concerned with the classical problem “given an additional x, predict its y”. We are interested instead in learning about the structure of the problem! We
usually know, from other sources, that the slopes and variance com- 
ponents that we consider are not zero. Thus we are not trying to 
establish their significance, but rather to set limits to the values that 
they reasonably may have (or to point to their apparent values). 
Limits which include zero, and thus fail to establish nonzeroness from 
the data at hand, are almost as likely to prove useful to us as limits 
which do not include zero. The critical question is the separation of 
the limits, and not whether or not they indicate “significance’’. 

While not identical, the correction of correlation coefficients for 
attenuation is extremely close to our topic. Any psychological claim 
for priority, however, should be accompanied with an explanation of 
why the idea was not extended to more interesting problems. (The 
writer believes that correlation coefficients have been highly “efficient”
in shielding the essentials of this and related situations from the in- 
quiring eye. Unless and until we come to variances and covariances 
in measured, not standardized units, it is hard to meet the facts face 
to face.) 

The three fields of application where these problems have been most 
prevalent are 


precision of measurement, 
psychology, 
econometrics. 




We shall discuss them a little further in connection with the more 
variate case. 


Joint analysis of variance. 


The central tool in our arithmetic will be a generalization of the 
analysis of variance to two or more variates taken jointly. One ap- 
proach has been given in [16] and another is given below. For the 
present we can leave the arithmetic details aside and merely examine 
the dictionary. The variances and covariance of the “steady parts” 
will be expressed by a “variance component between” or a “variance 
component for regression”. The slopes relating the steady parts are 
determined by these values, and are conveniently referred to in such 
terms as “slope of y on x in the between variance component”. 

We could have avoided mention of variance components, and their 
unbiased estimates, the components of mean square, in our examples, 
since we only deal with the simplest sorts of analysis of variance, be- 
tween-vs-within or regression-vs-balance. However, we have chosen to 
bring them in for two reasons. Mainly to set the analysis in terms 
which can be easily carried over to more complicated analyses where 
the correct procedure might otherwise be a mystery. Secondarily, to 
stress the analogy with variance components for a single variate. 

It seems to the writer that an understanding of variance com- 
ponents is essential to the understanding of regression with all variates 
subject to error, just as an understanding of analysis of variance is 
essential to the understanding of simple regression. 

To continue the dictionary, the problems set forth at the outset read 
as follows when translated into variance components terms: 


(1) What are the slopes in the variance component between (or for 
regression)? 

(2) Could the variance component between degenerate along a line, 
and if so with what possible slopes? (What are the possible 
slopes for a degenerate variance component for regression?) 

(3) What limits can we place on the slopes in the variance com- 
ponent between (or for regression)? 

(4) What is the variance component between (or for regression)? 

(5) What is the minimum value (B varying) for the variance com- 
ponent between for y — Bx? 

(6) What is the variance component within (or for the balance)? 

After spending a part on basic example and arithmetic, we shall spend four parts in illustrating the numerical solution of (1), (2) and (3), (4) and (5), and finally (6).




TABLE 1
CODED* CHEMICAL ANALYSES

Single cross      Sample      w     x     y     z

NY3 X D59           106      38    10     7     4
                    711      61    13     9     5
                    617      48    13     7     4
                    127      53    13     7     5
                  Total     200    49    30    18

NY3 X MS19          702      49    12     9     3
                    111      42     8     7     4
                    118      50     8     7     4
                    726      41    11     9     2
                  Total     182    39    32    13

NY3 X A206          603      40    10     5     3
                    311      39     9     7     3
                    420      35     7     8     2
                    624      35     8     7     2
                  Total     149    34    27    10

D59 X R51           404      18    11     5     1
                    609      36     9     8     1
                    619      34    10     8     1
                    724      40    10    11     2
                  Total     128    40    32     5

MS19 X 514A         703      38     9     4     1
                    709      37     5     9     1
                    321      23     5     6     3
                    222      21     3     5     0
                  Total     119    22    24     5

A206 X R51          503      21     5     3     0
                    614      24     7     4     2
                    116      19     2     5     0
                    427      21     7     3     1
                  Total      85    21    15     3

D50 X 51A           306      27    11     5     1
                    213       8     3     4     0
                    115       9     2     3     0
                    224      27     9     8     2
                  Total      71    25    20     3

D50 X B8            607      15     6     2     0
                    712       8     4     0     0
                    119       5     0     2    -1
                    328      10     3     1     0
                  Total      38    13     5    -1

R51 X B8            406      13     6     1     0
                    114       7     1     0     0
                    221       7     0     1     1
                    628      12     3     1     0
                  Total      39    10     3     1

*w = 10(% crude protein − 8.0%)
 x = 100(% lysine − 0.25%)
 y = 100(% methionine − 0.10%)
 z = 100(% tryptophane − 0.08%).




Acknowledgement. 


I should like to express my thanks to all those who have helped 
this development, particularly for the very helpful comments, criticism, 
and suggestions of H. Fairfield Smith, the suggestion of problem (5) 
by J. W. Hopkins, helpful discussions with Forman S. Acton and R. F. 
Link, the detection of errors by D. B. DeLury and R. F. Link, and the 
courtesy of W. R. Flach in providing the unpublished data used in the 
numerical examples.


BASIC EXAMPLE AND BASIC ARITHMETIC 


The basic example 


In all our discussion of the two-variate case, we shall use an ex- 
periment by Miller, Aurand and Flach [10] for illustration. This 
experiment on amino-acid and protein contents of corn was intended 
to throw light on an earlier suggestion that selection for specific amino 
acids, rather than for protein, might lead to a more nutritive variety 
of corn. The experimenters concluded that the experiment which we 
shall use gave no support for this view, since the amounts of lysine, 
methionine, tryptophane and crude protein, as found by analysis in four 
samples from each of nine single crosses, seemed to vary together. Our 
considerations will also support this conclusion. The basic data are 
given in Table 1 in coded form. 

In our analyses we shall treat these observations as though the field 
design had been a simple randomization of 36 plots. (The actual plots 
formed part of a larger field experiment arranged in a lattice. Since 
the field was unusually uniform, and plots used were distributed ir- 
regularly through the lattice, the neglect of block effects will not be too 
serious.) The field samples from each plot were reduced by quartering 
and brought into solution. Then aliquots were taken for the various 
chemical determinations. 


The arithmetic process 


The process that we shall use throughout is just the simple and 
natural extension of familiar analysis of variance procedures to two or 
more variates. The basic arithmetic, but not the motivation or inter- 
pretation, will be familiar to those accustomed to the analysis of co- 
variance. 

As an example, we consider the x and y values, as coded, for 2 of the 9 single crosses. We would have no hesitation in making an analysis of variance on x alone, on y alone, or on y − 2x alone. We can also make one on y − Bx, where B is still at our disposal after the analysis is complete. We have:




Single-cross       x     y    y − 2x    y − Bx       x²     y²     xy

D50 X B8           6     2     −10      2 − 6B       36      4     12
                   4     0      −8      0 − 4B       16      0      0
                   0     2       2      2             0      4      0
                   3     1      −5      1 − 3B        9      1      3
  Total           13     5     −21      5 − 13B      61      9     15

R51 X B8           6     1     −11      1 − 6B       36      1      6
                   1     0      −2      0 − B         1      0      0
                   0     1       1      1             0      1      0
                   3     1      −5      1 − 3B        9      1      3
  Total           10     3     −17      3 − 10B      46      3      9

Grand Total       23     8     −38      8 − 23B     107     12     24


From this we can easily find the sum of squares of all individual values

          for      for      for        for
Item       x        y      y − 2x     y − Bx

(I)       107       12      344       12 − 2(24)B + 107B²

where we have used $\sum (y - 2x)^2 = \sum y^2 - 4 \sum xy + 4 \sum x^2$ and the corresponding relation for $\sum (y - Bx)^2$. In particular, if we put B = 2 in the entry for y − Bx we find 344 as we should.

Similarly, one-fourth of the sum of squares of totals over single-crosses is found as

          for      for      for        for
Item       x        y      y − 2x     y − Bx

(II)     67.25     8.5     182.5      8.5 − 2(23.75)B + 67.25B²

where

$67.25 = \tfrac14 (13^2 + 10^2),$
$8.5 = \tfrac14 (5^2 + 3^2),$
$23.75 = \tfrac14 ((5)(13) + (3)(10)),$
$182.5 = \tfrac14 ((-21)^2 + (-17)^2),$

and the entry for y − 2x can also be found by placing B = 2.




Finally, one eighth of the square of the grand total is found as

          for      for      for        for
Item       x        y      y − 2x     y − Bx

(III)   66.125      8      180.5      8 − 2(23)B + 66.125B²

where

$66.125 = \tfrac18\, 23^2, \quad 8 = \tfrac18\, 8^2, \quad 23 = \tfrac18 (8)(23), \quad 180.5 = \tfrac18 (-38)^2.$
If we were analyzing x alone, we should almost automatically set

Sum of squares between = SSB = (II) − (III) = 67.25 − 66.125 = 1.125,
Sum of squares within = SSW = (I) − (II) = 107 − 67.25 = 39.75,
Mean square between = MSB = SSB/1 = 1.12,
Mean square within = MSW = SSW/6 = 6.62,
Component of mean square between = CMSB = (1/4)(MSB − MSW) = (1/4)(1.12 − 6.62) = −1.38,
Component of mean square within = CMSW = MSW = 6.62.

Here CMSB = −1.38 is the customary unbiased estimate of the component of variance between single crosses. The fact that it is negative is surely a sampling fluctuation, since the component of variance cannot be negative, and is almost surely positive.
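
The one-variate arithmetic above can be reproduced mechanically. The following is a minimal sketch in plain Python; the function name `one_way_components` and the dictionary layout are our own choices, and equal replication in every single cross is assumed.

```python
# Coded x values (lysine) for the two single crosses used in the illustration
groups = {
    "D50 X B8": [6, 4, 0, 3],
    "R51 X B8": [6, 1, 0, 3],
}

def one_way_components(groups):
    """Items (I), (II), (III) and the quantities derived from them."""
    values = [v for g in groups.values() for v in g]
    k = len(groups)                               # number of single crosses
    r = len(next(iter(groups.values())))          # samples per cross (assumed equal)
    item_I = sum(v * v for v in values)           # sum of squares of individual values
    item_II = sum(sum(g) ** 2 for g in groups.values()) / r
    item_III = sum(values) ** 2 / len(values)
    ssb, ssw = item_II - item_III, item_I - item_II
    msb = ssb / (k - 1)
    msw = ssw / (len(values) - k)
    return {"SSB": ssb, "SSW": ssw, "MSB": msb, "MSW": msw,
            "CMSB": (msb - msw) / r, "CMSW": msw}

result = one_way_components(groups)
for name, value in result.items():
    print(name, round(value, 3))
```

The divisor `r` in the component of mean square between is the number of samples per single cross, which is where the factor 1/4 in the text comes from.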

We can carry out this same procedure for x alone, y alone, y − 2x alone, and y − Bx alone, with the results shown in Table 2. If we observe that the real function of the B in the analysis of y − Bx is to label and keep separate the coefficients of 1, −2B, and B², we are led to carry just the coefficients alone. The natural arrangement is, for example, to replace 3.5 − 2(0.25)B + 39.75B² by


(x) 39.75 
0.25 
(y) 3.5 


where 39.75 refers to x alone, 3.5 to y alone and 0.25 to cross-products. 
For economy of printing we shall use the arrangement 


(x) 39.75 0.25 


(y) 3.5 


but we advise the reader to think of the numbers in the symmetrical 
arrangement. 

The resulting joint analysis of variance is shown in the last column 
of Table 2. 




TABLE 2
ILLUSTRATIVE ANALYSES OF VARIANCE FOR EXAMPLE

                                        Analysis of
Item                          x        y      y − 2x    y − Bx                            Joint

SSB = (II) − (III)          1.125     0.5      2.0      0.5 − 2(0.75)B + 1.125B²         (x)  1.125   0.75
                                                                                         (y)          0.5

SSW = (I) − (II)           39.75      3.5    161.5      3.5 − 2(0.25)B + 39.75B²         (x) 39.75    0.25
                                                                                         (y)          3.5

MSB = SSB/1                 1.125     0.5      2.0      0.5 − 2(0.75)B + 1.125B²         (x)  1.125   0.75
                                                                                         (y)          0.5

MSW = (1/6) SSW             6.62      0.58    26.92     0.58 − 2(0.04)B + 6.62B²         (x)  6.62    0.04
                                                                                         (y)          0.58

CMSB = (1/4)(MSB − MSW)    −1.38     −0.02    −6.23     −0.02 − 2(0.18)B − 1.38B²        (x) −1.38    0.18
                                                                                         (y)         −0.02

CMSW = MSW                  6.62      0.58    26.92     0.58 − 2(0.04)B + 6.62B²         (x)  6.62    0.04
                                                                                         (y)          0.58


The joint analysis, with its 3 entries in the two variate case, is 
convenient and symmetrical. We shall use it from this point on. 
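
The joint analysis carries the three coefficients separately and only substitutes a value of B at the end. The following is a minimal sketch of that bookkeeping for the two-cross example; the names `joint_form`, `table` and `evaluate` are ours, and equal replication is assumed.

```python
# (x, y) pairs, as coded, for the two single crosses of the illustrative example
crosses = [
    [(6, 2), (4, 0), (0, 2), (3, 1)],   # D50 X B8
    [(6, 1), (1, 0), (0, 1), (3, 1)],   # R51 X B8
]

def joint_form(crosses, pick_a, pick_b):
    """Between and within sums of products for two coordinates of each observation."""
    r = len(crosses[0])                  # samples per cross (assumed equal)
    n = r * len(crosses)
    obs = [p for cross in crosses for p in cross]
    item_I = sum(pick_a(p) * pick_b(p) for p in obs)
    item_II = sum(sum(map(pick_a, c)) * sum(map(pick_b, c)) for c in crosses) / r
    item_III = sum(map(pick_a, obs)) * sum(map(pick_b, obs)) / n
    return item_II - item_III, item_I - item_II

def x(p): return p[0]
def y(p): return p[1]

k, r = len(crosses), len(crosses[0])
table = {}
for name, (a, b) in {"xx": (x, x), "xy": (x, y), "yy": (y, y)}.items():
    ssb, ssw = joint_form(crosses, a, b)
    msb, msw = ssb / (k - 1), ssw / (k * r - k)
    table[name] = {"SSB": ssb, "SSW": ssw, "CMSB": (msb - msw) / r, "CMSW": msw}

def evaluate(line, B):
    """Value of one line of the analysis for y - Bx, built from the joint entries."""
    return table["yy"][line] - 2 * B * table["xy"][line] + B * B * table["xx"][line]

print(evaluate("SSB", 2), evaluate("SSW", 2))
```

Putting B = 2 in `evaluate` gives the y − 2x column without a separate pass through the data, which is the economy the joint arrangement is meant to provide.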


Note on regression 


If we are given a joint sum of squares for x and z, say

(x)  1384.88   571.68
(z)            302

then the decomposition of the sum of squares for x is

$$1384.88 = \left(1384.88 - \frac{571.68^2}{302}\right) + \frac{571.68^2}{302} = 302.70 + 1082.18$$

where 302.70 is the sum of squares for the balance, and 1082.18 is the sum of squares for the regression of x on z.

Similarly, if we are given a joint sum of squares for y − Bx and z, say

(y − Bx)  984.88 − 2(1069.12)B + 1384.88B²     418.33 − 571.68B
(z)                                             302

we have an exactly similar decomposition of the sum of squares into

$$\left\{984.88 - 2(1069.12)B + 1384.88B^2 - \frac{(418.33 - 571.68B)^2}{302}\right\} + \frac{(418.33 - 571.68B)^2}{302}$$

$$= \{405.44 - 2(277.25)B + 302.70B^2\} + \{579.44 - 2(791.87)B + 1082.18B^2\}$$


Clearly we have done the same thing as for x alone and z, although the 
result looks a little more complicated. 

We can simplify matters somewhat by going over to the joint 
analysis of x, y, z. If we ask for the sum of squares for y — Bx — Cz, 
we find 


984.88 − 2(1069.12)B + 1384.88B² − 2(418.33)C + 2(571.68)BC + 302C²


and in terms of a joint analysis, the decomposition of the sum of squares 
is based on 


(x)  1384.88   1069.12   571.68
(y)             984.88   418.33
(z)                      302

and is

(x)  1384.88   1069.12        (x)  302.70   277.25        (x)  1082.18   791.87
                          =                           +
(y)             984.88        (y)           405.44        (y)            579.44

where

$$1082.18 = \frac{571.68^2}{302}, \qquad 791.87 = \frac{(571.68)(418.33)}{302}, \qquad 579.44 = \frac{418.33^2}{302}.$$




TABLE 3
JOINT ANALYSIS OF VARIANCE OF w, x, y, z
BETWEEN AND WITHIN SINGLE CROSSES

(w)   w²         wx         wy         wz           (w)  36601      8817       6601       2352
(x)              x²         xy         xz           (x)             2285       1599        560
(y)                         y²         yz           (y)                        1292        408
(z)                                    z²           (z)                                    183
      pattern of entries                                 (I)

(w)   35275.25   8536.5     6450.25    2290         (w)  28392.25   7105.08    5279.67    1600.75
(x)              2124.25    1588.5      543.5       (x)             1778.03    1321.22     400.58
(y)                         1228        402.25      (y)                         981.78     297.67
(z)                                     165.75      (z)                                     90.25
      (II)                                               (III)

(w)   6883       1431.42    1170.58     689.25      (w)  1325.75     280.50     150.75      62
(x)               346.22     267.28     142.92      (x)              160.75      10.50      16.50
(y)                          246.22     104.58      (y)                          64          5.75
(z)                                      75.50      (z)                                     17.25
      SSB = (II) − (III)                                 SSW = (I) − (II)

(w)    860.38     178.93     146.32      86.16      (w)    49.10      10.39       5.58       2.30
(x)                43.28      33.41      17.86      (x)                5.95       0.39       0.61
(y)                           30.78      13.07      (y)                           2.37       0.21
(z)                                       9.44      (z)                                      0.64
      MSB = (1/8) SSB                                    MSW = (1/27) SSW

(w)    202.82      42.14      35.18      20.97      (w)    49.10      10.39       5.58       2.30
(x)                 9.33       8.26       4.31      (x)                5.95       0.39       0.61
(y)                            7.10       3.21      (y)                           2.37       0.21
(z)                                       2.20      (z)                                      0.64
      CMSB = (1/4)(MSB − MSW)                            CMSW = MSW


All regressions of two or more variates, jointly analyzed, on another 
single variate can be handled in exactly this way. 
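
The regression decomposition of the note can be sketched in a few lines. One assumption is made and labeled in the code: the joint sums of squares of the note are taken to be the between-cross sums of squares of Table 3 put on a single-cross-total basis, that is, multiplied by the 4 plots per cross. The function name `split_on_z` is ours.

```python
# Between-cross sums of squares and products on a per-plot basis (SSB of Table 3)
ssb_per_plot = {"xx": 346.22, "xy": 267.28, "xz": 142.92,
                "yy": 246.22, "yz": 104.58, "zz": 75.50}

# Assumption: the figures of the note are these values times the 4 plots per cross.
ss = {key: 4 * value for key, value in ssb_per_plot.items()}

def split_on_z(ss):
    """Split the joint (x, y) sums of squares into regression on z and balance."""
    regression = {
        "xx": ss["xz"] ** 2 / ss["zz"],
        "xy": ss["xz"] * ss["yz"] / ss["zz"],
        "yy": ss["yz"] ** 2 / ss["zz"],
    }
    balance = {key: ss[key] - regression[key] for key in regression}
    return regression, balance

regression, balance = split_on_z(ss)
for key in ("xx", "xy", "yy"):
    print(key, round(regression[key], 2), round(balance[key], 2))
```

The same three divisions by the sum of squares for z serve for any number of jointly analyzed variates, since each regression entry is a product of two cross-products with z.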


Basic arithmetic 


If we apply the joint analysis of variance to the coded observations 
of Table 1, taking w, x, y, z all together, we obtain the basic values, 
sums of squares, mean squares, and components of mean square shown 
in Table 3. The designations (J), (IJ) and (IIJ) refer to quantities 
exactly analogous to those similarly designated in the illustrative 
example. Thus 36601 is the sum of the squares of all individual w’s; 35275.25 is one fourth the sum of the squares of single-cross totals of w’s; 28392.25 is one thirty-sixth the square of the grand total of w’s.
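
The whole of Table 3 follows from Table 1 by the same three items applied to every pair of variates. The following is a minimal sketch; the function name `joint_analysis` and the key convention (`"wx"`, `"xy"`, and so on) are our own.

```python
# Coded (w, x, y, z) analyses of Table 1: nine single crosses, four samples each
table_1 = [
    [(38, 10, 7, 4), (61, 13, 9, 5), (48, 13, 7, 4), (53, 13, 7, 5)],   # NY3 X D59
    [(49, 12, 9, 3), (42, 8, 7, 4), (50, 8, 7, 4), (41, 11, 9, 2)],     # NY3 X MS19
    [(40, 10, 5, 3), (39, 9, 7, 3), (35, 7, 8, 2), (35, 8, 7, 2)],      # NY3 X A206
    [(18, 11, 5, 1), (36, 9, 8, 1), (34, 10, 8, 1), (40, 10, 11, 2)],   # D59 X R51
    [(38, 9, 4, 1), (37, 5, 9, 1), (23, 5, 6, 3), (21, 3, 5, 0)],       # MS19 X 514A
    [(21, 5, 3, 0), (24, 7, 4, 2), (19, 2, 5, 0), (21, 7, 3, 1)],       # A206 X R51
    [(27, 11, 5, 1), (8, 3, 4, 0), (9, 2, 3, 0), (27, 9, 8, 2)],        # D50 X 51A
    [(15, 6, 2, 0), (8, 4, 0, 0), (5, 0, 2, -1), (10, 3, 1, 0)],        # D50 X B8
    [(13, 6, 1, 0), (7, 1, 0, 0), (7, 0, 1, 1), (12, 3, 1, 0)],         # R51 X B8
]
NAMES = "wxyz"

def joint_analysis(groups):
    """Items (I), (II), (III) and derived lines for every pair of variates."""
    k, r = len(groups), len(groups[0])          # 9 crosses, 4 samples each
    obs = [row for g in groups for row in g]
    totals = [[sum(row[i] for row in g) for i in range(4)] for g in groups]
    grand = [sum(row[i] for row in obs) for i in range(4)]
    out = {}
    for i in range(4):
        for j in range(i, 4):
            item_I = sum(row[i] * row[j] for row in obs)
            item_II = sum(t[i] * t[j] for t in totals) / r
            item_III = grand[i] * grand[j] / (k * r)
            msb = (item_II - item_III) / (k - 1)
            msw = (item_I - item_II) / (k * (r - 1))
            out[NAMES[i] + NAMES[j]] = {
                "I": item_I, "II": item_II, "III": item_III,
                "SSB": item_II - item_III, "SSW": item_I - item_II,
                "MSB": msb, "MSW": msw, "CMSB": (msb - msw) / r, "CMSW": msw,
            }
    return out

table_3 = joint_analysis(table_1)
print(round(table_3["xy"]["CMSB"], 2), round(table_3["xy"]["CMSW"], 2))
```

Only the upper triangle of each arrangement is computed, matching the printed layout in which each symmetric entry appears once.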




WHAT SLOPE IS INDICATED? 
Basis for analysis 


We are going to use various sorts of analyses between and within 
to estimate the slope in the between variance component. We shall 
use them on the single example for compactness, though they would 
usually be alternatives. We are always relying on the fact that the 
component of mean square between is an unbiased estimate of the 
component of variance between. Thus, the slope in CMSB is a more 
or less reasonable estimate of the slope in the between variance com- 
ponent. 

We leave the setting of limits on slopes to the next part, and merely 
state here the analyses of variance and the indicated slopes. 


Categories 


The first form of analysis arises when the observations may be 
divided into categories, where those within any one category may be 
considered as replicated. This we may do in the corn example by 
dividing the 36 samples into 9 groups of 4 according to the single cross 
used. The analysis of variance may be found by extending the 2 single- 
cross numerical example (Table 2) to all 9, or just by picking some 
figures out of the joint analysis for all four variates (Table 3). It is 
given in Table 4. 


TABLE 4
ANALYSIS OF VARIANCE BY CATEGORIES FOR x AND y

                    DF            MS                    CMS
Between              8     (x)  43.28   33.41      9.33   8.26
single crosses             (y)          30.78             7.10
Within              27     (x)   5.95    0.39      5.95   0.39
single crosses             (y)           2.37             2.37


The slopes (in the between variance component) of y on x and of x on y can be estimated easily as

$$\frac{8.26}{9.33} = 0.885 \quad \text{(slope of y on x between)}$$

$$\frac{8.26}{7.10} = 1.163 \quad \text{(slope of x on y between)}$$




In an actual population, an actual sample, or in the between variance component itself, the product of such slopes would be ≤ 1 (being in fact the square of the correlation coefficient). Here the product is


(0.885) (1.163) = 1.029. 


This is evidence of sampling fluctuation and of a biased selection among sampling fluctuations, and not cause for alarm, just as are the cases where a one-variate component of mean square turns out negative. (In fact the product is greater than unity just because the component of mean square for $(\sqrt{7.10}\,x - \sqrt{9.33}\,y)$ is negative!) We do not throw out these indicated slopes, we merely note that they are likely to be a little high.
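
The link between a slope product above unity and a negative component can be checked directly from the Table 4 figures. This is a minimal sketch; the helper `component` is ours, and it simply evaluates the joint component for a linear combination a·x + b·y. Both sign choices of the combination built from √7.10 and √9.33 are tried, and the smaller value is kept.

```python
import math

# Components of mean square between single crosses for x and y (Table 4)
c_xx, c_xy, c_yy = 9.33, 8.26, 7.10

slope_y_on_x = c_xy / c_xx
slope_x_on_y = c_xy / c_yy
product = slope_y_on_x * slope_x_on_y

def component(a, b):
    """Component of mean square between for the combination a*x + b*y."""
    return a * a * c_xx + 2 * a * b * c_xy + b * b * c_yy

a, b = math.sqrt(c_yy), math.sqrt(c_xx)
lowest = min(component(a, b), component(a, -b))

print(round(slope_y_on_x, 3), round(slope_x_on_y, 3), round(product, 3))
print(round(lowest, 2))
```

The product exceeds unity exactly when the squared cross term exceeds the product of the two diagonal terms, and in that case some linear combination of x and y must have a negative component.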


Regression 


In the second form of analysis “between” is the single degree (or 
degrees) of freedom for regression on a (or several) quantitative instru- 
mental variates. This was the form proposed by Geary [6]. As an 
example, we can take 


z = coded amount of tryptophane by analysis 


and work with the sums over four plots of the values of x, y, z for the nine single crosses. The sum of squares between on this basis will be just 4 times the SSB already given on a per plot basis. Calculating the degrees of freedom for regression as illustrated above, we find the analysis in Table 5.


TABLE 5
ANALYSIS OF VARIANCE FOR REGRESSION ON z FOR x AND y JOINTLY

                DF            MS                      CMS*
Regression       1     (x)  1082.18   791.87      3.44    2.49
on z                   (y)            579.44              1.73
Balance          7     (x)    43.24    39.61     43.24   39.61
                       (y)             57.92             57.92

*For regression this is (MS regression − MS balance)/(SS total for z).


Here the estimated slopes are

$$\frac{2.49}{3.44} = 0.72 \quad \text{(slope of y on x)}$$

and

$$\frac{2.49}{1.73} = 1.44 \quad \text{(slope of x on y)}$$

again with

(0.72)(1.44) = 1.04


coming out somewhat higher than unity. (This will always happen 
with components based on 1 degree of freedom.) 
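
The components for regression and the two slopes follow from the mean squares of Table 5 by the rule in its footnote. This is a minimal sketch, with dictionary names of our own choosing.

```python
# Mean squares of Table 5, keyed as xx, xy, yy
ms_regression = {"xx": 1082.18, "xy": 791.87, "yy": 579.44}
ms_balance = {"xx": 43.24, "xy": 39.61, "yy": 57.92}
ss_total_z = 302.0           # regression divisor: total sum of squares for z

# Component for regression = (MS regression - MS balance) / (SS total for z)
cms_regression = {key: (ms_regression[key] - ms_balance[key]) / ss_total_z
                  for key in ms_regression}

slope_y_on_x = cms_regression["xy"] / cms_regression["xx"]
slope_x_on_y = cms_regression["xy"] / cms_regression["yy"]

print({key: round(value, 2) for key, value in cms_regression.items()})
print(round(slope_y_on_x, 2), round(slope_x_on_y, 2))
```

Because the regression mean squares come from a single degree of freedom, their slope product is exactly one before the balance is subtracted; the subtraction is what moves the product of the component slopes away from unity.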

If we recall the plan of the experiment, we can see why z is not a 
satisfactory instrumental variate, since fluctuations in z should not be 
expected to be orthogonal to those in x and y. Field plot variation and 
the fluctuations involved in selecting and bringing into solution the 4 
chemical samples for each single cross might be expected to enter 
simultaneously in the fluctuations in the measured amounts of lysine, 
methionine and tryptophane. This gives every reason to doubt orthog- 
onality of fluctuation. 

Even in this experiment, we could obtain data where orthogonality 
might be expected by a trick. This would involve selecting, at random, 
one sample from each single cross to provide the lysine value, another 
for the methionine value and a third for the tryptophane value. This 
would involve discarding a lot of data, in a random but arbitrary way 
and we shall not pursue it further. However, our difficulties here with 
lack of orthogonality in sampling and measurement point toward one 
sort of precaution that is always needed in an experiment to be analyzed 
with an instrumental variate. 


Another regression example 


A more natural application of this method is brought to light if 
we seek to study the slopes within single crosses. We have admitted 
that plot differences, field sampling, and common chemical treatment 
may have produced common fluctuations in x and y. Let us see what 
we can learn about this common fluctuation component. The 27 degrees 
of freedom within single crosses provide evidence on this point and may, 
of course, be analyzed just as though they came from the differences 
within 28 replicate samples from one single cross. 

We need only take the sums of squares and products within single 
crosses for x, y and z, and then take out the one degree of freedom for 
regression. The resulting numbers are in Table 6. 




TABLE 6
JOINT ANALYSIS OF VARIANCE
(WITHIN SINGLE CROSSES) OF x AND y FOR REGRESSION ON z

                DF           MS                   CMS*
Regression       1     (x)  15.78   5.50      0.59    0.31
on z                   (y)          1.92             −0.03
Balance         26     (x)   5.58   0.19      5.58    0.19
                       (y)          2.39              2.39

*Regression divisor = SS total for z = 17.25.


We see that sampling fluctuations have driven the between component for y negative, so that we cannot expect much accuracy from other parts of the analysis. The fact that variance ratios for both x and y fail to reach the 5% level of 4.23 adds to this impression. The indicated slope of y on x of

$$\frac{0.31}{0.59} = 0.53$$

is reasonably close to the category value of 0.88, probably within sampling error, while the indicated slope of x on y of

$$\frac{0.31}{-0.03}$$

is surely meaningless. We have learned little in this analysis, but while far from “significance” have learned something about slopes “within”.
This illustrates the only procedure which seems available for obtaining 
information about such common fluctuations within the finest classi- 
fication of an analysis of variance. 
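
The within-cross regression analysis can be rebuilt from the within sums of squares and products of Table 3. This is a minimal sketch with names of our own choosing; the slope is computed here from unrounded components, so it differs slightly from the figure obtained from the two-decimal entries of Table 6.

```python
# Sums of squares and products within single crosses (SSW of Table 3), 27 d.f.
ssw = {"xx": 160.75, "xy": 10.50, "xz": 16.50,
       "yy": 64.0, "yz": 5.75, "zz": 17.25}
df_balance = 27 - 1          # one degree of freedom is taken out for regression on z

ms_regression = {"xx": ssw["xz"] ** 2 / ssw["zz"],
                 "xy": ssw["xz"] * ssw["yz"] / ssw["zz"],
                 "yy": ssw["yz"] ** 2 / ssw["zz"]}
ms_balance = {key: (ssw[key] - ms_regression[key]) / df_balance
              for key in ms_regression}
cms_regression = {key: (ms_regression[key] - ms_balance[key]) / ssw["zz"]
                  for key in ms_regression}

variance_ratio_x = ms_regression["xx"] / ms_balance["xx"]
variance_ratio_y = ms_regression["yy"] / ms_balance["yy"]
slope_y_on_x = cms_regression["xy"] / cms_regression["xx"]

print(round(variance_ratio_x, 2), round(variance_ratio_y, 2), round(slope_y_on_x, 2))
```

A variance ratio below one for y is what drives its component for regression negative, so the sign of that component carries no information beyond the smallness of the ratio.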


Groups 


The third form of analysis involves selecting two or more groups of observations in some way which we believe to be orthogonal at least to the fluctuations in x and y. In the corn example, the total protein
content in some previous experiment would serve nicely. So would 
rumor as to their protein yields, or the corn breeder’s prediction in 
advance of the experiment. These were not available to me, but, after 
the fact, we can note that the pairs of single crosses with the two highest 
and the two lowest crude protein yields are significantly [17] separated 
in mean yield from the remaining 5 single crosses. 

If the between variance component is really degenerate, that is to say if the steady parts of x and y determine each other linearly, then we have only to ask of this division into groups that it be orthogonal to the fluctuations in x and y. For the reasons just discussed in connection with the regression example, the fluctuations in z could not be expected to be orthogonal to those in x and y. But the groups we selected were significantly separated from the other groups, so that we can believe that the fluctuations in z did not determine the separation into groups. We may conclude that the instrumental variate is valid in this case.

If the between variance component does not degenerate, then we 
must also require that the groups are orthogonal to the individual parts 
of x and y. If we beg Miller, Aurand and Flach’s original question, and 
assume a single genetically controlled factor, then this additional orthog- 
onality might seem reasonable. Otherwise we would have to say that 
the validity of this instrumental variate was doubtful. 

We shall go ahead and use the two groups of 8 samples each from 
the highest and lowest pairs, discarding the 20 samples from the 5 
intermediate single crosses. The 16 determinations remaining give rise 
to an analysis of variance with 1 DF between groups and 14 DF within. 
The numbers are in Table 7. 


TABLE 7

HIGH VS. LOW GROUPS: ANALYSIS OF VARIANCE FOR x AND y JOINTLY

                 DF             MS                CMS

High vs. low      1    (x)    264     219      32.3    27.3
                       (y)            182              22.6

Within           14    (x)   5.24    0.50       5.2    0.50
                       (y)           0.82               0.8


The apparent slopes for the high-vs-low variance component are

\[ \frac{27.3}{32.3} = 0.85 \quad\text{and}\quad \frac{27.3}{22.6} = 1.21, \]

whose product

\[ (0.85)(1.21) = 1.03 \]

is again slightly more than unity.



Although we divided into three groups and discarded the center one
as Bartlett [1] has proposed, we did not adopt his suggestion of
arbitrarily cutting up the observations into groups according to their
x-values. We used significant cuts (on w rather than x, though this is
irrelevant to the present point). His procedure is subject to some bias,
though the bias is much less than the bias produced by entirely neg- 
lecting the presence of errors in x. The group division here corresponds 
to defining

\[ z = \begin{cases} -1, & x < x_1, \\ \phantom{+}0, & x_1 \le x \le x_2, \\ +1, & x > x_2. \end{cases} \]

Clearly the cases where the true value of x is near x_1 or x_2 produce a
small correlation between z and the fluctuations in x. In many instances
the correlation, and the consequent bias in slope, are negligible, in others 
they must be accepted because there is no method other than 
Bartlett’s which will give a reasonable estimate of the slope. In such 
cases we should, of course, use Bartlett’s method. 

We neglected the central 5 single crosses, and combined the end 
two. This may seem incautious and wasteful, yet the results of Nair 
and Shrivastava [11, 1942, esp. p. 124] indicate that if the single-cross 
means were equally spaced, the efficiency would be 82% when only 
one variate is subject to error. In our case we might well reach effi- 
ciencies of 90-95% because we have taken advantage of irregular spacing. 


Discussion 


All these three sorts of analyses are essentially alike—they depend 
on an instrumental variate—in the examples the value of this variate 
was: 

the name of the single cross (categories) 

the tryptophane content by analysis (regression) 

“high”, “discarded”, or “low” crude protein (groups) 
In each case we had to assume that the instrumental variate was or- 
thogonal to the fluctuations—and in the nondegenerate case to the 
individual parts—but not to the underlying quantity of interest. In 
every case we performed a joint analysis of variance between and 
within, and took the slopes from the CMSB. All procedures are alike,
very much alike, and it seems likely that future procedures will fall
into the same mold.

In another way there are great differences. We face the great
division between sheep and goats—the one that dominates the choice of 
significance tests in the analysis of covariance—was “this” unaffected 
by the results of the experiment, or wasn’t it? In the interpretation of 
tests of significance, we have constantly to ask “Might the covariate 
have been influenced by the treatments?” and if the answer is ‘‘yes” 
we have to interpret the result with very considerable caution. In our 
present field, we have constantly to ask ‘‘Might the instrumental variate 
tend to fluctuate with the two variates of interest or to vary with their 
individual parts?” and if the answer is “‘yes” we have to interpret the 
experiment with very considerable caution. 

In our particular example, we can examine the answers in the various 
cases. The reasonable conclusions seem to be: 


Instrumental variate                            Degenerate case   Nondegenerate case

Name of single-cross                                  No                 No
Tryptophane analysis                                  Yes                Yes
Last year’s protein yield                             No                 Yes
Rumor about protein yields
    as of last year                                   No                 Yes
    after experiment                                  Yes                Yes
High and low protein groups
    Separation of groups of single crosses
        “significant”                                 No                 ?
        “not significant”                             Yes                Yes


The variety of questions raised in getting these answers suggests that
this is a substantial and useful branch of experimental design which
needs much attention.


WHAT SLOPES CAN BE EXCLUDED 


F-tests and quadratic equations 


We shall find that we can set approximate limits on our slopes by
means of F-tests, and this will, in every case, lead us to a quadratic
equation. These quadratics arise from the constant appearance of mean
squares for y − Bx, where B is still a trial value, but is regarded for
the moment as a contemplated value of the slope β.

The simplest case arises when we assume that the between variance
component degenerates along a line. We have only to write out the
variance ratio (between vs. within!) for y − Bx and state that it should
be no larger than the critical value of F. This leads to a quadratic in B.

More arithmetic occurs when we admit that the between variance
component need not be degenerate; here we must work with the two
slopes, y on x and x on y, separately. It is shown in the appendix that
we may take

\[ \frac{\dfrac{MSB_x \, MSB_{y-Bx}}{DFB} + \dfrac{MSW_x \, MSW_{y-Bx}}{DFW}}{(MSB_x - MSW_x)^2} \]

as an estimate of the variance of the estimate of slope y on x provided
that we have at least two degrees of freedom between. Here MSB =
mean square between, MSW = mean square within, DFB = degrees
of freedom between, etc. Note that this variance depends on B, so
that the t-test, first written as

\[ \frac{\left| (\text{estimated slope } y \text{ on } x) - B \right|}{\sqrt{c_0 - 2c_1 B + c_2 B^2}} < t \]

is naturally converted into

\[ \left( (\text{estimated slope } y \text{ on } x) - B \right)^2 < F \, (c_0 - 2c_1 B + c_2 B^2) \]

where F is on 1 and DFB degrees of freedom. This is another quadratic
in B.
Before applying these procedures to the corn example, we shall work
an illustrative example which is far from degeneracy. The analysis of
variance is given in Table 8.


TABLE 8

ARTIFICIAL EXAMPLE OF JOINT ANALYSIS

                 DF             MS                CMS

Between          10    (x)    100      33         9       3
                       (y)            135                12

Within           99    (x)     10       3        10       3
                       (y)             15                15


The apparent slope of y on x is 3/9 = .333 and that of x on y is 3/12 =
.250.


The mean squares for y − Bx between and within are

\[ 135 - 2(33)B + 100B^2 \]

and

\[ 15 - 2(3)B + 10B^2 \]


and the condition that B be a possible (with 5% risk) slope of
degeneracy for the between variance component is

\[ \frac{135 - 2(33)B + 100B^2}{15 - 2(3)B + 10B^2} < 1.93 \]

where 1.93 is the upper 5% point of F on 10 and 99 degrees of freedom.
This condition is not fulfilled for any real B. (In fact the ratio never
falls as low as the upper 0.06% point of F.) Thus we conclude (with 
less than 0.06% risk of error) that the variance component does not 
degenerate. (The corn example will present cases where we conclude 
that “if the variance component does degenerate, it does so with a slope 
between ...and...’’!) 
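
The statement that no real B satisfies the condition can be checked by finding the smallest value the variance ratio takes as B varies. The sketch below assumes the within quadratic is positive for every B (true here) and uses the fact that the ratio reaches an extreme value λ exactly when MSB(B) − λ·MSW(B) has a double root. The function name `ratio_extremes` is hypothetical, not from the paper.

```python
import math


def ratio_extremes(between, within):
    """Smallest and largest values of
    (a0 - 2*a1*B + a2*B**2) / (b0 - 2*b1*B + b2*B**2) over real B.

    Each argument is (c0, c1, c2) for the quadratic c0 - 2*c1*B + c2*B**2.
    An extreme value lam makes between(B) - lam*within(B) a perfect square,
    i.e. (a1 - lam*b1)**2 == (a0 - lam*b0) * (a2 - lam*b2).
    """
    a0, a1, a2 = between
    b0, b1, b2 = within
    # Collect the double-root condition as qa*lam**2 + qb*lam + qc = 0.
    qa = b1 * b1 - b0 * b2
    qb = a0 * b2 + a2 * b0 - 2 * a1 * b1
    qc = a1 * a1 - a0 * a2
    root = math.sqrt(qb * qb - 4 * qa * qc)
    return sorted([(-qb - root) / (2 * qa), (-qb + root) / (2 * qa)])


# Mean squares for y - Bx in the artificial example (Table 8).
lowest, highest = ratio_extremes((135, 33, 100), (15, 3, 10))
print(round(lowest, 2), round(highest, 2))
```

The smallest ratio is well above the 5% point of 1.93, which is the sense in which the component cannot be degenerate along any line.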

Now let us face up to a between variance component that does not 
degenerate, and begin with a slope of y on x. The estimated variance 
of the slope is, according to the formula above 


\[ \frac{\dfrac{100\,(135 - 2(33)B + 100B^2)}{10} + \dfrac{10\,(15 - 2(3)B + 10B^2)}{99}}{(100 - 10)^2} = 0.1668 - 2(0.0408)B + 0.1236B^2. \]


Now the upper 5% point of F on 1 and 10 degrees of freedom is 4.96 
so that the inequality becomes 


\[ (.333 - B)^2 < 4.96\,(0.1668 - 2(0.0408)B + 0.1236B^2) \]

or

\[ -.716 - 2(.131)B + .387B^2 < 0. \]

The extreme values of B for which this holds are

\[ \frac{.131}{.387} \pm \frac{\sqrt{(.131)^2 + (.716)(.387)}}{.387} = 0.34 \pm 1.40 = -1.06 \text{ or } 1.74. \]

Thus we conclude that the slope of y on x in the between variance
component lies between −1.06 and 1.74.


If we follow the same process for the slope of x on y, and denote its 
trial value by B’, we have 


\[ \frac{\dfrac{135\,(100 - 2(33)B' + 135B'^2)}{10} + \dfrac{15\,(10 - 2(3)B' + 15B'^2)}{99}}{(135 - 15)^2} = 0.0939 - 2(.0310)B' + 0.1267B'^2 \]

and

\[ (.250 - B')^2 < 4.96\,(0.0939 - 2(.0310)B' + 0.1267B'^2) \]

whence the slope of x on y in the between variance component lies with
5% risk between −0.81 and 1.33.

We cannot say very much about slopes in this artificial example, 
we cannot say that they are “significant’’, but we have been able to 
exclude a good many possible values. This is some progress. 
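
Both sets of limits for the artificial example can be reproduced mechanically from Table 8. The following is a minimal sketch under the approximate variance formula quoted earlier; the function name `slope_limits` and its argument layout are assumptions made for illustration, and the routine only handles the case where the admissible trial values form a finite interval.

```python
import math


def slope_limits(msb, msw, dfb, dfw, f_crit):
    """Limits for a slope in the between component (nondegenerate case).

    msb, msw: (MS of regressor, mean product, MS of dependent variate).
    The slope variance is c0 - 2*c1*B + c2*B**2, and B is admissible when
    (slope - B)**2 < f_crit * variance, a quadratic inequality in B.
    """
    bx, bp, by = msb
    wx, wp, wy = msw
    denom = (bx - wx) ** 2
    c0 = (bx * by / dfb + wx * wy / dfw) / denom
    c1 = (bx * bp / dfb + wx * wp / dfw) / denom
    c2 = (bx * bx / dfb + wx * wx / dfw) / denom
    slope = (bp - wp) / (bx - wx)
    # Rearranged inequality: q0 - 2*q1*B + q2*B**2 < 0.
    q0 = slope ** 2 - f_crit * c0
    q1 = slope - f_crit * c1
    q2 = 1 - f_crit * c2
    disc = q1 * q1 - q0 * q2
    if q2 <= 0 or disc < 0:
        raise ValueError("admissible values do not form a finite interval")
    root = math.sqrt(disc)
    return slope, (q1 - root) / q2, (q1 + root) / q2


# Table 8: 10 DF between, 99 DF within, F(1, 10) at 5% = 4.96.
y_on_x = slope_limits((100, 33, 135), (10, 3, 15), 10, 99, 4.96)
x_on_y = slope_limits((135, 33, 100), (15, 3, 10), 10, 99, 4.96)
print([round(v, 2) for v in y_on_x])
print([round(v, 2) for v in x_on_y])
```

Swapping the roles of x and y in the two tuples is all that is needed to pass from the slope of y on x to the slope of x on y.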


The corn example 


The three sorts of analyses lead to the following mean squares, in- 
equalities and limits for the slope of the variance component if degenerate 


                   Categories               Regression                 Groups

Table                  4                        5                         7

MSW           2.37 − 2(0.39)B         57.92 − 2(39.61)B          0.82 − 2(0.50)B
                + 5.95B²                + 43.24B²                  + 5.24B²

MSB           30.78 − 2(33.41)B       579.44 − 2(791.87)B        182 − 2(219)B
                + 43.28B²               + 1082.18B²                + 264B²

DFW                   27                        7                        14
DFB                    8                        1                         1

Inequality    MSB < 2.30 MSW          MSB < 5.59 MSW             MSB < 4.60 MSW

Equation      25.33 − 2(32.51)B       255.67 − 2(570.45)B        178.2 − 2(216.7)B
                + 29.60B² = 0           + 840.47B² = 0             + 239.9B² = 0

Roots =
Limits (5%)   .51 and 1.69            .28 and 1.07               .63 and 1.17
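
The three pairs of limits follow from the same small computation: subtract the critical F times the within quadratic from the between quadratic and solve. A sketch, with the hypothetical helper `degenerate_limits` and a dictionary layout chosen only for this illustration:

```python
import math


def degenerate_limits(msb, msw, f_crit):
    """Roots of MSB(B) - f_crit * MSW(B) = 0, where each mean square is
    given as (c0, c1, c2) for the quadratic c0 - 2*c1*B + c2*B**2.
    Returns None when no real B satisfies MSB(B) < f_crit * MSW(B)."""
    q0, q1, q2 = (b - f_crit * w for b, w in zip(msb, msw))
    disc = q1 * q1 - q0 * q2
    if q2 <= 0 or disc < 0:
        return None
    root = math.sqrt(disc)
    return (q1 - root) / q2, (q1 + root) / q2


# (MSB coefficients, MSW coefficients, 5% point of F) for each analysis.
corn = {
    "categories": ((30.78, 33.41, 43.28), (2.37, 0.39, 5.95), 2.30),
    "regression": ((579.44, 791.87, 1082.18), (57.92, 39.61, 43.24), 5.59),
    "groups": ((182.0, 219.0, 264.0), (0.82, 0.50, 5.24), 4.60),
}
limits = {name: degenerate_limits(*args) for name, args in corn.items()}
for name, (low, high) in limits.items():
    print(name, round(low, 2), round(high, 2))

# The artificial example of Table 8 admits no degenerate slope at all.
artificial = degenerate_limits((135, 33, 100), (15, 3, 10), 1.93)
```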


If we admit that the between variance component need not be de- 
generate, and that we should therefore be setting limits for two separate 
slopes, we can treat the categories example in the same way that we 
just treated the artificial example (our regression and groups examples 
involve only one degree of freedom between, and hence should not 
be treated in this way—even if their instrumental variates were valid). 
We find, using values from Table 4 as those from Table 8 were used above,

                                Slope y on x              Slope x on y

Estimated variance of slope     0.120 − 2(0.130)B         0.206 − 2(0.159)B
                                  + 0.169B²                 + 0.147B²

Quadratic equation              0.145 − 2(0.193)B         0.082 − 2(0.239)B
                                  + 0.101B² = 0             + 0.218B² = 0

Roots = limits (5%)             0.42 and 3.40             0.19 and 2.0

and note that when we give up the assumption of degeneracy, our
limits become much more widely separated.


LIMITS FOR THE BETWEEN VARIANCE COMPONENT 


Modified form of Bross’s approximate limits 


Bross [3] has proposed approximate fiducial limits for an upper
variance component. In order to set symmetrical 90% limits, we need
4 tabular 5% points of F. If we are to apply this procedure to the
category analysis of the corn example, where the upper variance
component had 8 DF and the lower variance component had 27 DF, we need
the tabular 5% points

DF               8 and 27    8 and ∞    27 and 8    ∞ and 8

Tabular entry      2.31        1.94       3.10        2.93

and the Bross lower and upper limits may be put, respectively, in the forms

\[ \frac{MSB - 2.31\,MSW}{1.94\,MSB - 2.31\,MSW}\,(CMSB) \]

and

\[ \frac{MSB - \dfrac{MSW}{3.10}}{\dfrac{MSB}{2.93} - \dfrac{MSW}{3.10}}\,(CMSB). \]

Plausibility considerations are advanced in the appendix for restricting
the use of these limits to the case where significance is exhibited (in
the example, where MSB > 2.31 MSW), and for using, when MSB
is not significantly greater than MSW, the lower limit zero, as Bross
proposed, and the modified upper limit

\[ \frac{MSB - \dfrac{MSW}{3.10}}{4/2.93} \]

where the 4 is the divisor used, in this example, in finding CMSB from
MSB − MSW.

The results of using these procedures on x, y, and on y − Bx for
several values of B, are in Table 8. There are two reasons for presenting
several values of B. First, to emphasize the fact that B is still a trial
value, and is still a matter of choice; we may choose it to suit our
purpose. Second, to illustrate how the limits on y − Bx vary with B
in this case, and to provide some background for the next topic.


TABLE 8

BROSS OR MODIFIED BROSS LIMITS (5% EACH TAIL)
FOR VARIOUS VARIANCE COMPONENTS, BASED ON TABLE

                                                 Variance component
Variate        MSB      MSW     Signif.    Lower     Estimate     Upper
                                 at 5%     limit     by CMSB      limit

x             43.28     5.96       +        3.92       9.33        30.0
y             30.78     2.37       +        3.31       7.10        21.9

y − 0.2x      19.15     2.45       +        1.79       4.18        13.4
y − 0.4x      10.97     3.01       +        0.56       1.99        7.17
y − 0.6x       6.27     4.05       −        0.00       0.56        3.64
y − 0.8x       5.02     5.56       −        0.00      −0.13        2.37
y − 1.0x       7.24     7.55       −        0.00      −0.08        3.52
y − 1.2x      12.92    10.02       −        0.00       0.72        7.09
y − 1.4x      22.06    12.96       −        0.00       2.28        13.1
y − 1.6x      34.67    16.38       −        0.00       4.58        21.5
y − 1.8x      50.73    20.28       +        0.58       7.62        31.3
y − 2.0x      70.26    24.65       +        1.91       11.4        44.3


A similar procedure could be followed for the other methods of 
analyzing the corn example, but the estimation of a variance com- 
ponent on one degree of freedom is not likely to prove profitable. 
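
The limits in the table can be recomputed from the mean squares and the four tabular F points. The sketch below follows the forms given in the text (Bross limits when MSB is significant, zero and the modified upper limit otherwise); the function name `bross_limits` and its parameter names are assumptions for illustration only.

```python
def bross_limits(msb, msw, divisor, f_bw, f_b_inf, f_wb, f_inf_b):
    """Approximate 5%-each-tail limits for the upper variance component.

    f_bw    : 5% point of F on (DFB, DFW)
    f_b_inf : 5% point of F on (DFB, infinity)
    f_wb    : 5% point of F on (DFW, DFB)
    f_inf_b : 5% point of F on (infinity, DFB)
    divisor : divisor used in finding CMSB from MSB - MSW
    Returns (lower limit, CMSB estimate, upper limit).
    """
    cmsb = (msb - msw) / divisor
    if msb > f_bw * msw:
        # Significance exhibited: Bross's own limits.
        lower = (msb - f_bw * msw) / (f_b_inf * msb - f_bw * msw) * cmsb
        upper = (msb - msw / f_wb) / (msb / f_inf_b - msw / f_wb) * cmsb
    else:
        # Not significant: lower limit zero, modified upper limit.
        lower = 0.0
        upper = (msb - msw / f_wb) / (divisor / f_inf_b)
    return lower, cmsb, upper


# Tabular points for the category analysis (8 DF between, 27 DF within).
points = dict(f_bw=2.31, f_b_inf=1.94, f_wb=3.10, f_inf_b=2.93)

row_x = bross_limits(43.28, 5.96, 4, **points)      # variate x
row_06 = bross_limits(6.27, 4.05, 4, **points)      # variate y - 0.6x
print([round(v, 2) for v in row_x])
print([round(v, 2) for v in row_06])
```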


Replacement variance 


The type of analysis just carried out is useful in replacement situa- 
tions. Let us suppose that y is the result of a standard, slow and an- 
noying procedure of chemical analysis, while x is obtained by a tenta- 
tive, quick and smooth chemical procedure. We naturally wish to re- 
place the use of y by the use of a suitably calibrated x. Before we make 
the change, we must be sure of two things 


(a) the fluctuations in x are small enough for our purposes,
(b) the individual part of y − βx is constant enough.


In (b) we have allowed for the fact that y and x measure slightly
different things, that no matter how you calibrate x, there will be
certain sorts of samples for which “y” and “calibrated x” will differ
on the average. In the analyst’s language, these sorts of samples
contain constituents which “interfere” slightly with one method or the
other. We must live with the corresponding systematic errors when we
change from y to x. How large are they?




Some will say that we should be concerned with the systematic errors
of “x calibrated” and not with the systematic differences from “y”.
But every “true value” from which we are to measure an error is
determined in some way, by some measurement or analysis. Just let
this be “y”.

To deal with (a) we need only consider the lower variance
component, and so we postpone this question to the next part. Clearly (b)
can be paraphrased to read:


(b′) When β is chosen to minimize the upper variance component
for y − βx, how big is this minimized variance component?


The statistical difficulties arise from the fact that the seemingly best
value of β is never the actually best value of β. It is a consequence of
this (we shall not trace the steps) that, on the average, the seemingly
best β seems better, that is, seems to have a smaller variance component
as judged from the minimized component of mean square, than the
actually best β actually is! We need to make some allowance for this.
Exact theory does not seem to be available, so we shall propose a rule
of thumb based on the ±σ principle. (Some attention is given to the
plausibility of this rule in the appendix.)

We begin with the indicated slope of y on x (from Table 4) of

\[ \frac{8.26}{9.33} = 0.885 \]

and its approximate variance (from the previous part) of

\[ 0.120 - 2(0.130)B + 0.169B^2. \]

The trial values of B which are at the “±σ points” are found by
equating this to

\[ (0.885 - B)^2, \]

of which the roots are 0.74 and 1.07. If we apply the procedure of the
last section to these values we find

                                  Minimized variance component

                                  Lower      Estimate     Upper
Variate         MSB      MSW      limit      (CMSB)       limit

y − 0.74x       5.03     5.06     0.00       −0.01        2.49
y − 1.07x       8.83     8.34     0.00        0.12        4.50


Our rule of thumb is to average those two cases, and thus to conclude 
that the indicated irremovable variance due to systematic errors is 
0.06 and that crude 90% limits run from 0.00 to 3.50. While the 
accuracy of these approximate limits is unknown, they should still be 
useful in providing a guide for the user. (We reiterate that we were 
not trying to estimate the calibration, but rather to judge how good 
the ideal calibration would be.) 
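
The two trial slopes and the averaged result of the rule of thumb can be retraced as follows. This is a sketch only: the variable names are illustrative, and the limits for the two trial slopes are copied from the small table above rather than recomputed.

```python
import math

slope = 0.885                   # indicated slope of y on x
c0, c1, c2 = 0.120, 0.130, 0.169  # variance = c0 - 2*c1*B + c2*B**2

# Equate (slope - B)**2 to the variance: q0 - 2*q1*B + q2*B**2 = 0.
q0 = slope ** 2 - c0
q1 = slope - c1
q2 = 1 - c2
root = math.sqrt(q1 * q1 - q0 * q2)
b_low, b_high = (q1 - root) / q2, (q1 + root) / q2

# Rows for y - 0.74x and y - 1.07x: (lower limit, estimate, upper limit).
row_low = (0.00, -0.01, 2.49)
row_high = (0.00, 0.12, 4.50)

# Rule of thumb: average the two cases entry by entry.
averaged = tuple((a + b) / 2 for a, b in zip(row_low, row_high))
print(round(b_low, 2), round(b_high, 2), averaged)
```

The averaged estimate and upper limit round to the 0.06 and 3.50 quoted in the text.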


WHAT ARE THE ERROR VARIANCES? 
Categories 


When we can make a straightforward analysis of the observations 
into between and within, as was the case in the category treatment of 
the corn example, we have a direct estimate of the variances and co- 
variances of the fluctuations in x and y. These are presented in the 
within mean square, and offer no novel problems. However, this is not 
the only sort of variability that we can assess. 

If we suppose that (1) plot differences and field sampling give rise 
to differences in x and y which are really confined to the same line as 
the effects due to changing the single-cross, and (2) there are no other 
sources of covariance between x and y within a single cross, then we 
can combine the estimated slope of y on x between, with the mean 
square within to estimate the variances of the remaining fluctuations. 
In the example we have

slope y on x = 0.88,

\[ \text{mean square within} = \begin{pmatrix} 5.95 & 0.39 \\ & 2.37 \end{pmatrix}. \]

The mean product of 0.39 corresponds to a variance component of

\[ \begin{pmatrix} 0.44 & 0.39 \\ & 0.34 \end{pmatrix} \]

where 0.34 = (0.39)(0.88), 0.44 = (0.39)/(0.88). Subtracting this, we
have the estimates

variance component peculiar to x = 5.51
variance component peculiar to y = 2.03

which are to be compared with the results of the regression method
applied to the 27 degrees of freedom within (Table 6), which were

variance component peculiar to x = 5.58
variance component peculiar to y = 2.39



(It will be recalled that the last value corresponded to a negative com- 
ponent of mean square between, and hence is known to involve a con- 
siderable sampling fluctuation.) 
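
The subtraction just described is short enough to write out. A minimal sketch, under the two suppositions stated above (within-cross differences confined to the same line, and no other source of covariance); the variable names are illustrative only.

```python
slope_y_on_x = 0.88                        # slope between, from the category analysis
msw_x, mp_within, msw_y = 5.95, 0.39, 2.37  # within mean squares and mean product

# A common part lying along the line with this slope and having covariance
# mp_within contributes mp_within/slope to var(x) and mp_within*slope to var(y).
common_x = mp_within / slope_y_on_x
common_y = mp_within * slope_y_on_x

# What remains is peculiar to each variate.
peculiar_x = msw_x - common_x
peculiar_y = msw_y - common_y
print(round(common_x, 2), round(common_y, 2))
print(round(peculiar_x, 2), round(peculiar_y, 2))
```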

This procedure is particularly simple when, as in some measurement 
problems (e.g. Whitwell et al [15, 19]), the errors in x and y are so small 
compared to the differences between categories that we can safely 
obtain the slope from any sort of fitting of a straight line to the category 
means. This situation arises naturally when two equally “standard”
methods of measurement, which can be used on the same specimen, are
to be compared and exact duplicate specimens are not available. (In
particular, re-using of specimens must be impossible, otherwise simpler
methods would be more appropriate.)


The case of known slope 


If the slope of the common relation of y to x is known (e.g. is known 
to be 1.00) from other data, then, as Karl Pearson [12] and Grubbs [7] 
have pointed out, we can estimate the variability of each measuring 
device and of the specimens in the same way as in the last section. 
We may expect to get full value from this procedure 


(1) when one specimen can undergo both measurements but not
more than once, and when exact duplicate specimens are
unobtainable,


or 


(2) when this information is a byproduct of experiments mainly in- 
tended for another purpose. 


MORE THAN TWO VARIATES 


Three variates with known slope 


The fact that, given three observers with equal response and un- 
correlated orthogonal errors, the variability of individual observers can 
be found without any further knowledge of the quantities observed 
surprised Karl Pearson [12]. So did the fact that his observers had 
correlated errors. This 1902 reference is the earliest I know. The fact 
was rediscovered and extended by Grubbs [7], whose instruments seem 
to have orthogonal errors. This case is native to the field of precision
of measurements.



Tetrads. 


Closely related is the early work of Spearman leading to the exami- 
nation of tetrads by Burt [4] in 1909 in a way that clearly implied that 
correlations (or as we should prefer, variances and covariances) between 
k variates with linearly related common parts and orthogonal fluctuations
give determination for k = 3 and overdetermination for k ≥ 4.
(This is explicitly stated, for example, in Dunlap [5] and in Thurstone 
[14, p. 265].) This development was immured behind a wall made of 
coefficients of correlation. 


Econometrics 


The development in econometrics is much later. Of the 27 references
in Reiersøl’s recent review [13], only 2 are dated before 1934, and
one of these is by a psychologist. Econometrics can claim the credit of 
stimulating Wald and Geary. Perhaps it is the Godmother of the 
instrumental method. 


The corn example again 


Let us come back briefly to the corn example, and compare the
observed CMSB with a triangle of numbers corresponding to the steady
parts of the four variates being restricted to a single line in four space,
restricted so that any one determines the other three linearly. We have,
rounding to tenths,

\[ CMSB = \begin{pmatrix} 202.8 & 42.1 & 35.2 & 21.0 \\ & 9.3 & 8.3 & 4.3 \\ & & 7.1 & \ldots \\ & & & 2.2 \end{pmatrix}, \qquad \text{Trial} = \begin{pmatrix} 202.8 & 43.4 & 35.7 & 20.1 \\ & 9.3 & 7.7 & 4.3 \\ & & 6.3 & 3.6 \\ & & & 2.0 \end{pmatrix}. \]

The fit is clearly quite good, and we can go far toward believing in a 
single genetic parameter (probably controlled by many genes) which 
determines all 4 of crude protein, lysine, methionine, and tryptophane 
in these single crosses. The trial values are based on matched changes
of 1.00 in w, 0.21 in x, 0.18 in y, and 0.10 in z. Decoding, the matched
changes would be:


1% change in crude protein, 
0.021% change in lysine, 
0.018% change in methionine, 
0.010% change in tryptophane. 


To decide whether this 4-dimensional slope fits, and what others do
within prescribed risk, would require the use of a three-dimensional
generalization of the F-test. Except in the case of 1 degree of freedom,
where Hotelling’s T² test is surely appropriate, there are various
possibilities (various functions of the latent roots). We shall not carry this
example further in this paper.


Factor analysis


The example just discussed, although it arises in plant breeding, is 
very close to the typical factor analytical problems of psychology. In 
Spearman’s original language, we asked if a ‘‘g’’ would suffice. More 
typical are problems with many more variates (many tests) and several 
factors. The first stage of the factor analysis is to boil things down 
to a suitable number of factors. There has been considerable discussion 
among factor analysts of what to boil down (How to estimate com- 
munalities? or do you neglect them?). From our present point of view, 
it is clear that we wish to boil down a variance component, and hence 
that we should work on a CMS, since that is the natural estimate of a 
variance component. 

In our example the plant scientist was careful to set up a clear, well- 
defined situation in which a CMS could be isolated that corresponds to 
the variance component that is of interest. He has arranged his ex- 
periment so that he can get at what he wants. If he had been in- 
terested in protein-amino acid relations over a wide range of soils and 
weather conditions, he would undoubtedly have set up his experiment 
at a number of places in a number of years. He would have set it up 
so that a CMS for varieties (single-crosses) could have been obtained 
which was on the average free of place-variety and year-variety inter- 
actions, just as in the actual experiment the CMS between was on the 
average free of errors of sampling and analysis. 

If the psychologist was equally careful, he would conduct his
experiments rather differently than any I have seen. Let us suppose that
we have three tests A, B and C. Their “reliability” was probably
determined by “split-halves”, that is, by comparing scores on the odd
and even-numbered questions. This procedure inevitably forces fatigue
effects and day-to-day differences of mental whatsoever to appear in the
reliable component and not in the unreliable one. This is well-known.
But we are interested in stable aspects of the subject, stable over long
periods like a month. So we shall actually split these tests and have
tests A1, A2, B1, B2, C1, C2. Now the order of taking tests has effects
of unknown magnitude, and so do day-to-day differences common to
many students (a big basketball game or dance the night before, for
example). Thus we will wish to give the tests in many orders. The
following set of 18 orders, where it is assumed that two tests can be
given in immediate succession on each of three days, provides a large
measure of balance, and the possibility of isolating many factors. It is
probably far from the best, but is an illustration.


1st     A1  A2    C1  C2    B1  B2    A1  B2*  |  B1  A2*
day     A2  A1    C2  C1    B2  B1    B2  A1   |  A2  B1

2nd     B1  B2    A1  A2    C1  C2    B1  C2   |  C1  B2
day     B2  B1    A2  A1    C2  C1    C2  B1   |  B2  C1

3rd     C1  C2    B1  B2    A1  A2    C1  A2   |  A1  C2
day     C2  C1    B2  B1    A2  A1    A2  C1   |  C2  A1


*This represents 4 columns formed from the 2 given by rotation through the days.


Now we need detailed testing experience. Can we give 6 equivalent 
tests in 18 orders to 18 students in a single class without too much 
leakage? (If not, some other design using 18 orders could be used!) 
If we can, then we would naturally make up blocks of 18 students, one 
or two blocks to a class, trying to make the blocks as educationally 
homogeneous as possible, and allotting them to the 18 orders at random.
Then our analysis would fall into classifications of order (which includes 
the date of each individual test), block, and person. If we had obtained 
k blocks, we should have the skeleton 


             DF            MS
Blocks       k − 1         ?     } joint analysis on
Orders       17            ?     } six variates
Persons      17(k − 1)     *     } (split-tests)


The mean square (6 mean squares and 15 mean products if you count 
individual numbers) marked * is free of the effects of blocks and order. 
If we compare A1 with A2, B1 with B2, and C1 with C2 within this
mean square, and find them equivalent enough, then we can think of
A1 and A2, for example, as two applications of test A. We now have,
by a little addition and subtraction, an analysis of variance for the 
three tests given twice. This has the skeleton 


                 DF            MS     CMS
Persons*         17(k − 1)     ?      **     } Joint analysis
Replication*     17(k − 1)     ?      ?      } on 3 tests


*After allowance for orders and blocks.


The CMS marked ** is now suitable material for a factor analysis. A
similar rearrangement of the mean squares for orders and blocks in the 
first analysis would be advisable as a check on possible complexities. 

The simplicity of the rearrangement can be shown by a numerical 
case for 2 tests each split once. Suppose the first MS was 


(A1)         15    12     4     2
(A2)               17     5     3
(B1)                     27    20
(B2)                           24

If we rotate to

(A1 + A2)    56    −2    14     4
(A1 − A2)           8    −2     0
(B1 + B2)                91     3
(B1 − B2)                      11

where 56 = 15 + 17 + 2(12), 8 = 15 + 17 − 2(12),
3 = 27 − 24,
0 = 4 + 3 − 5 − 2,
4 = 5 + 4 − 3 − 2, etc.

then we see that we can preserve most of these numbers in the desired
form

                        MS              CMS
Persons        (A)    56    14        24     7
               (B)          91              40
Replication    (A)     8     0         8     0
               (B)          11              11

and that we essentially assume that the 4 covariances between
(A1 + A2) and (B1 + B2) on the one hand and (A1 − A2) and (B1 −
B2) on the other are zero (or nearly so). The final CMS in this example
is, of course

\[ \tfrac{1}{2}\{(\text{MS-persons}) - (\text{MS-replication})\} = \tfrac{1}{2} \begin{pmatrix} 48 & 14 \\ & 80 \end{pmatrix}. \]



It would be very interesting, and perhaps very enlightening to see the 
results of a few such experiments. 
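
The rotation in the numerical case is an ordinary change of variates applied to a symmetric matrix of mean squares and products. A sketch in plain Python; the names `rotate`, `persons` and `replication` are chosen for this illustration only.

```python
# Mean squares and products for (A1, A2, B1, B2), written as a full symmetric matrix.
ms = [
    [15, 12, 4, 2],
    [12, 17, 5, 3],
    [4, 5, 27, 20],
    [2, 3, 20, 24],
]

# Rows define the new variates: A1+A2, A1-A2, B1+B2, B1-B2.
rot = [
    [1, 1, 0, 0],
    [1, -1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, -1],
]


def rotate(matrix, transform):
    """Return transform * matrix * transform^T for square lists of lists."""
    n = len(matrix)
    return [
        [
            sum(transform[i][a] * matrix[a][b] * transform[j][b]
                for a in range(n) for b in range(n))
            for j in range(n)
        ]
        for i in range(n)
    ]


rotated = rotate(ms, rot)

# Keep sums with sums (persons) and differences with differences (replication);
# the four sum-by-difference products are the ones assumed to be near zero.
persons = [[rotated[0][0], rotated[0][2]], [rotated[2][0], rotated[2][2]]]
replication = [[rotated[1][1], rotated[1][3]], [rotated[3][1], rotated[3][3]]]
cms = [[(p - r) / 2 for p, r in zip(prow, rrow)]
       for prow, rrow in zip(persons, replication)]
print(rotated)
print(cms)
```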


Weighting component measurements or judgements. 


Another multivariate case which is often passed by with inadequate 
treatment is the problem of weights. Something which we measure 
with error as w, is believed to be dependent on, say 3, variates which 
we measure with error as x, y and z. We wish to know one or both of: 


(a) With what weights do the structural quantities which appear 
disguised by error as x, y, and z appear in the structural quantity 
that appears disguised as w? 


(b) Is a weighted linear combination of the structural quantities 
disguised as x, y, and z sufficient to specify the structural 
quantity disguised as w? 


Now the usual procedure is to apply multiple regression, either to single 
values or to averaged values (averages of x, y, z and w are somewhat less 
disguised by error). This would be appropriate in answering: 


(c) With what weights can we combine x, y and z to best predict w?
(d) What will be the residual variance? 


But there are many cases where we want structural information! And 
multiple regression misses the point. 

One example might be a taste-testing procedure where expert judges 
can assess both an overall rating, and several partial ratings. We may 
wish to know how the components enter into the result—not because 
we are going to ask the experts to stop giving an overall rating, not 
because we want to use a compound of partial ratings to predict an 
overall rating—but because we want to tell the producer what the judges 
don’t know themselves: how much weight do they subconsciously place
on each partial rating. We want structural information! 

Here we would naturally set up an analysis with judges, dates, samples,
and, possibly, order within day, as classifications, and begin by looking
at the CMS for samples. Most of us can parallel this example in our
own field of interest.

The problems of the econometrician are much like this in one way, 
and much different in others. They want structural weights or coeffi- 
cients, and are now boldly facing up to the struggle to obtain them. 
But they cannot replicate, they can group but rarely, they are left 
with the method of regression. This has its dangers, particularly when 
assumed orthogonality (lack of correlation) may be disturbed by such 
things as 


(a) a clerical error common to two series 
(b) a random fluctuation in a price index used to deflate two series. 


(c) the random element contained in individual economic decisions 
(for example, the date when U. S. Steel decided to build an
Atlantic Coast plant). 

Has the econometrician given such problems close scrutiny? Has he 
chosen models of adequate complexity and refinement, so that they 
have a hope of fitting, even though the parameters cannot be obtained 
from present data? Or has he chosen models for manageability? Has 
he realized that it is equally arbitrary to assume a certain covariance is 


(a) zero 
(b) 12 X 10° (dollars)? 
and that if (b) is nearer correct than (a) then it is better to assume (b)? 


APPENDIX 


The variability of slopes from components 


We have estimated slopes in components by formulas such as this 


SPB SPW 
__BEW 
CMSB, MSB,— MSW, MSB, — MSW, 

We are going to find an approximate variance for such a slope, making 

approximations which we know will lead to the neglect of marginal 

biases and hence, finally, to an underestimate of the desired variance. 

By writing “| x” we signify that all x’s are to be kept fixed, and we use 
approximations of the form 


var {y | x} = var {y — Bx | x} ~ var {y — Br}. 


We have to deal with linear combinations 


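$$\sum c_{ij}(y_{ij} - y_{i\cdot}) \quad \text{and} \quad \sum c_i (y_{i\cdot} - y_{\cdot\cdot})$$

and we observe that the $y_{ij} - y_{i\cdot}$ behave like differences from a
population of variance aver{$\mathrm{MSW}_y$} while the $y_{i\cdot} - y_{\cdot\cdot}$ behave like
differences from a population of variance (1/r) aver{$\mathrm{MSB}_y$}, so that

$$\operatorname{var}\Big\{\sum c_{ij}(y_{ij} - y_{i\cdot}) \,\Big|\, x\Big\} = \Big(\sum c_{ij}^2\Big)\,\operatorname{aver}\{\mathrm{MSW}_y \mid x\}$$

$$\operatorname{var}\Big\{\sum c_i (y_{i\cdot} - y_{\cdot\cdot}) \,\Big|\, x\Big\} = \frac{1}{r}\Big(\sum c_i^2\Big)\,\operatorname{aver}\{\mathrm{MSB}_y \mid x\}$$

Thus we shall estimate these variances by, respectively,

$$\Big(\sum c_{ij}^2\Big)\,\mathrm{MSW}_{y-\beta x} \quad \text{and} \quad \frac{1}{r}\Big(\sum c_i^2\Big)\,\mathrm{MSB}_{y-\beta x}.$$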
We have 


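$$\mathrm{SPB} = r \sum_i (x_{i\cdot} - x_{\cdot\cdot})(y_{i\cdot} - y_{\cdot\cdot}), \qquad \mathrm{SPW} = \sum_i \sum_j (x_{ij} - x_{i\cdot})(y_{ij} - y_{i\cdot}),$$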


so that, approximately (using the sort of approximation mentioned 
above,) 


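$$\operatorname{var}\{\mathrm{SPW} \mid x\} \approx \Big(\sum_i \sum_j (x_{ij} - x_{i\cdot})^2\Big)\,\mathrm{MSW}_{y-\beta x} = \mathrm{DFW}\cdot\mathrm{MSW}_x\,\mathrm{MSW}_{y-\beta x},$$

$$\operatorname{var}\{\mathrm{SPB} \mid x\} \approx \Big(r \sum_i (x_{i\cdot} - x_{\cdot\cdot})^2\Big)\,\mathrm{MSB}_{y-\beta x} = \mathrm{DFB}\cdot\mathrm{MSB}_x\,\mathrm{MSB}_{y-\beta x},$$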


while their covariance is approximately zero. 
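Thus, approximately,

$$\operatorname{var}\{\text{slope } y \text{ on } x \mid x\} \approx \frac{\dfrac{\mathrm{MSB}_x\,\mathrm{MSB}_{y-\beta x}}{\mathrm{DFB}} + \dfrac{\mathrm{MSW}_x\,\mathrm{MSW}_{y-\beta x}}{\mathrm{DFW}}}{(\mathrm{MSB}_x - \mathrm{MSW}_x)^2}$$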


If we may substitute $\mathrm{MSB}_{y-bx}$ for $\mathrm{MSB}_{y-\beta x}$ and $\mathrm{MSW}_{y-bx}$ for
$\mathrm{MSW}_{y-\beta x}$, we have the formula used in the body of this paper. In
most cases, the first term in the numerator will entirely dominate the 
second term, so that the result of this substitution is, for all x fixed, 
essentially a mean square on DFB degrees of freedom. Thus we are 
led to a reasonable F test, which we have used. In the special case, 
however, where DFB = 1, we have MSB,_», = a,(B — B,)’, and it 
seems unwise to attempt to use such a test to set limits. 


The modified Bross limits 


If we adopt Bross's notation of

$$F = \mathrm{MSB}/\mathrm{MSW}$$
$F_{.05}$ = 1/(tabular 5% point of F for DFW and DFB)
$F'_{.05}$ = 1/(tabular 5% point of F for $\infty$ and DFB)
his upper limit of ~ 
1 1 
cm sp = — 1) MSY 


(where the average value of MSB is o”? + ro;) was chosen to meet the 
limiting conditions 


limiting value when 
DFW and 
CMSB = 1) thus F Fos 
0 F = F 5 


Its general behavior is perhaps better seen from the factored form 


(F = os)(F 1) MSW 
(F .95/ Fos) 


Since F’,;/F .o, # 1, we obtain a negative upper limit when F lies 
between F,;/F’,; and 1 and rapidly changing values nearby. This is 
obviously wrong. 

The modified form 


MSW 
(F F TF’ os 


has the correct limiting values in all three limiting conditions and 

avoids this difficulty. It is clearly a better approximate bound for 

small F. We may write it out more fully as : 
1 MSW 4 

r/F;%(°, DFB) 


MSW  F,%(DFW, DFB) 




MSW 
r/Fs%( ©, DFB) 


which is the form used in the body of the paper when F < F,%,(DFB, 
DFW) 


The ±σ approximation


It is well known to all that if x has any respectable distribution, and
f(x) is a linear function of x, that we can find the average of f(x) by
substituting the average, say μ, of x for x. It is less well known that,
if f(x) is a quadratic in x, we can find the average of f(x) by substituting
μ ± σ for x, where σ is the standard deviation of x, and averaging the
two results; thus

$$\operatorname{average}\{f(x)\} = \tfrac{1}{2}\{f(\mu + \sigma) + f(\mu - \sigma)\}$$


for all quadratic f’s (including, of course, all linear ones). 
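The identity can be checked numerically. The sketch below is only an illustration: the discrete distribution and the quadratic are made-up values, and the names `plus_minus_sigma_average`, `exact_average` and `shortcut_average` are hypothetical. It compares the true average of a quadratic with the average of its values at μ + σ and μ − σ.

```python
import math

def plus_minus_sigma_average(f, mu, sigma):
    """Average of f evaluated at mu + sigma and at mu - sigma."""
    return 0.5 * (f(mu + sigma) + f(mu - sigma))

# Hypothetical discrete distribution (values and probabilities are made up).
values = [0.0, 1.0, 4.0, 9.0]
probs = [0.1, 0.4, 0.3, 0.2]

# Mean and standard deviation of the distribution.
mu = sum(p * v for p, v in zip(probs, values))
sigma = math.sqrt(sum(p * (v - mu) ** 2 for p, v in zip(probs, values)))

def quadratic(x):
    """An arbitrary quadratic, chosen only for illustration."""
    return 2.0 * x * x - 3.0 * x + 5.0

# Exact average of f(x) over the distribution, and the two-point shortcut.
exact_average = sum(p * quadratic(v) for p, v in zip(probs, values))
shortcut_average = plus_minus_sigma_average(quadratic, mu, sigma)
```

The two numbers agree up to rounding because a quadratic depends on the distribution only through its mean and variance; for a cubic or higher function the shortcut would in general be only approximate.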

In many situations we have to calculate from observations some 
quantity which also depends on an unknown parameter. If this de- 
pendence is linear, and if we have an independent unbiased estimate of 
the parameter, we will be correct on the average if we substitute the 
estimate for the unknown parameter. But if the dependence is non- 
linear, this need not be the case. If the dependence were quadratic, 
and if we knew the variance of our estimate, we should be correct on 
the average if we substituted “estimate ± standard deviation” for the
unknown parameter and averaged the two results. 

When we have an estimate of an unknown parameter, and only an 
estimate of its variance, it will still often pay us to follow the same 
procedure, and substitute 


estimate ± estimated standard deviation.


This is the ±σ approximation in its simplest form.

We frequently face a situation where the estimated variance of an 
estimate of a parameter depends on the true value of that parameter. 
In such a case the logical procedure is to compare 


$$(\text{estimate}) - (\text{trial value})$$
with
$$(\text{estimated variance if trial value correct})^{1/2}$$


and judge the significance accordingly. Thus in setting limits we will 
find the trial value on both sides of the equation. 
In view of this, the natural form of the ±σ approximation, when
the variance of the estimate of the parameter depends on the true
value of the parameter, is to substitute the solutions of

$$(\text{estimate}) - (\text{trial value}) = \pm(\text{var. if trial value correct})^{1/2}$$


and average the two results. This is what we have done in setting 
limits for the variance of substitution (of calibrated x for y). 
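A minimal sketch of this procedure follows. The variance function is an assumption made for illustration only: the estimate is taken to be a mean square on `df` degrees of freedom, so that its variance, if the trial value t were correct, is 2t²/df. The helper names `bisect` and `trial_values` are hypothetical, and the paper does not say how the equations were solved; bisection is used here simply because it is easy to verify.

```python
import math

def bisect(g, lo, hi, tol=1e-12):
    """Root of g in [lo, hi] by bisection; g(lo) and g(hi) must differ in sign."""
    g_lo = g(lo)
    if g_lo * g(hi) > 0:
        raise ValueError("root not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_lo * g_mid <= 0:
            hi = mid              # sign change lies in [lo, mid]
        else:
            lo, g_lo = mid, g_mid  # sign change lies in [mid, hi]
    return 0.5 * (lo + hi)

def trial_values(estimate, variance_if_true, upper_bracket):
    """Solve estimate - t = +sqrt(v(t)) and estimate - t = -sqrt(v(t)) for t."""
    below = bisect(lambda t: estimate - t - math.sqrt(variance_if_true(t)),
                   0.0, estimate)
    above = bisect(lambda t: estimate - t + math.sqrt(variance_if_true(t)),
                   estimate, upper_bracket)
    return below, above

# Assumed illustration: a mean square of 10.0 on 12 degrees of freedom.
df = 12
estimate = 10.0
variance = lambda t: 2.0 * t * t / df
below, above = trial_values(estimate, variance, 10.0 * estimate)
```

Under this assumed variance function the two solutions have the closed forms estimate/(1 + c) and estimate/(1 − c) with c = (2/df)^{1/2}, which gives a direct check on the solver. Any quantity depending on the parameter would then be evaluated at `below` and `above` and the two results averaged.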


REFERENCES 


[1] M. S. Bartlett, “Fitting a straight line when both variables are subject to error”,
Biometrics 5, 207-212, 1949.

[2] Joseph Berkson, “Are there two regressions”, Jour. Am. Stat. Assn. 45, 164-180,
1950.

[3] Irwin Bross, “Fiducial intervals for variance components”, Biometrics 6, 136-
144, 1950.

[4] Cyril Burt, “Experimental tests of general intelligence”, Brit. Jour. Psych. 3,
94-177, 1909-10.

[5] Jack W. Dunlap, “Comparable tests and reliability”, Jour. Educ. Psych. 24,
442-253, 1933.

[6] R. C. Geary, “Determination of linear relations between systematic parts of
variables with errors of observation the variances of which are unknown”,
Econometrica 17, 30-58, 1949.

[7] Frank E. Grubbs, “On estimating precision of measuring instruments and
product variability”, Jour. Am. Stat. Assn. 43, 243-264, 1948.

[8] J. Hemelrijk, “Construction of a confidence region for a line”, Proc. Kon. Ned.
Akad. v. Wet. (Amsterdam) 52, 995-1005, 1949.

[9] D. V. Lindley, “Regression lines and the linear functional relationship”, Suppl.
J. Roy. Stat. Soc. 9, 218-244, 1947.

[10] R. C. Miller, L. W. Aurand, and W. R. Flach, “Amino acids in high and low
protein corn”, Science 112, 57-58, 1950.

[11] K. R. Nair and M. P. Shrivastava, “On a simple method of curve fitting”,
Sankhya 6, 121, 1942-3.

[12] Karl Pearson, “On the mathematical theory of errors of judgment, with special
reference to the personal equation”, Early Statistical Papers (1948) 377-441,
Cambridge University Press (reprinted from Phil. Trans. of Roy. Soc. London
A198, 235-299, 1902).

[13] Olav Reiersøl, “Identifiability of a linear relation between variables which are
subject to error”, Econometrica 18, 375-389, 1950.

[14] L. L. Thurstone, “Multiple-factor analysis”, Univ. of Chicago Press, 1947.

[15] Richard K. Toner, Carol F. Bowen, and John C. Whitwell, “Moisture deter-
mination in textiles by electrical meters - Part 1”, Textile Research Journal 18,
526-535, 1948. “Part 2”, Textile Research Journal 19, 1-8, 1949.

[16] John W. Tukey, “Dyadic anova, an analysis of variance for vectors”, Human
Biology 21, 65-110, 1949.

[17] John W. Tukey, “Comparing individual means in the analysis of variance”,
Biometrics 5, 99-114, 1949.

[18] Abraham Wald, “The fitting of straight lines if both variables are subject to
error”, Annals Math. Stat. 11, 284-300, 1940.

[19] John C. Whitwell, “Estimating precision of textile instruments”, Biometrics 7,
102-112, 1951.




ANALYSIS OF VARIANCE WITH UNEQUAL BUT 
PROPORTIONATE NUMBERS OF OBSERVATIONS 
IN THE SUB-CLASSES OF A TWO-WAY 
CLASSIFICATION 


H. FAIRFIELD SMITH


Institute of Statistics 
North Carolina State College 


ANALYSIS OF VARIANCE is used to provide the solution to two more
or less distinct problems: (1) to detect and estimate components of 
variance in a composite population, (2) to detect and evaluate the 
significance of differences among means of sub-sets (Eisenhart, 1947). 
Attention has been drawn to complexities and some unsolved problems 
when there are disproportionate numbers of observations in each sub- 
class of a multiple classification. But the case of proportionate sub- 
class frequencies is usually passed over in a way which may lead the 
unwary to suppose that it has the same simplicity as when all sub-class 
means have equal weight. The purpose of this note is to call attention
to the fact that such a supposition is incorrect.

When the sub-class numbers, although unequal, are proportionate 
to their marginal totals, it is well known that the additive property of 
sums of squares still holds good, and the analysis of variance can be 
carried through in the usual way. Beyond this it is generally implied,
although never explicitly stated, that interpretation as well as arithmetical
procedure follows the usual lines; for example, Snedecor (1946),
Sec. 11.9, writes, “. . . causing no injury to the analysis of variance”. 
The implication however requires qualification (1) when the problem is 
to estimate components of variance, and (2) for tests of significance in 
which the within class variance fails to provide the appropriate estimate 
for error. 

Suppose that there are p classes $A_1, A_2, \ldots, A_p$, with numbers of
observations proportional to $a_1, a_2, \ldots, a_p$; and q classes $B_1, B_2, \ldots,
B_q$, with numbers of observations proportional to $b_1, b_2, \ldots, b_q$. The
sub-class frequencies and numbers of observations in marginal totals may
be represented by Table 1, which is quite general since any common
factor can be absorbed into one of the sets of proportion constants.


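TABLE 1

Class    $B_1$       $B_2$       ...   $B_q$       Total
$A_1$    $a_1b_1$    $a_1b_2$    ...   $a_1b_q$    $a_1Sb$
$A_2$    $a_2b_1$    $a_2b_2$    ...   $a_2b_q$    $a_2Sb$
...      ...         ...         ...   ...         ...
$A_p$    $a_pb_1$    $a_pb_2$    ...   $a_pb_q$    $a_pSb$
Total    $b_1Sa$     $b_2Sa$     ...   $b_qSa$     $SSab = (Sa)(Sb) = N$
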
Let $S'aa'$ indicate summation over all $p(p - 1)$ permutations of pairs of $a$,
$= 2 S_i S_{i'} a_i a_{i'}$ where $i < i'$, $= (Sa)^2 - Sa^2$; and similarly for $S'bb'$.
$X_{ab}$ = the sum of $ab$ observations in each sub-class
$X_{a\cdot}$ = the sum of $a(Sb)$ observations in each A row
$X_{\cdot b}$ = the sum of $b(Sa)$ observations in each B column
$C = (SSX_{ab})^2/N$ = the “correction for the mean”.
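The layout of Table 1 and the identity for S'aa' can be checked on a small case. The proportion constants below are hypothetical, chosen only to make the arithmetic easy to follow; the variable names are illustrative and not the author's.

```python
from itertools import permutations

# Hypothetical proportion constants for p = 3 classes A and q = 2 classes B.
a = [1, 2, 3]
b = [2, 5]

# Sub-class frequencies a_i * b_j, as in the body of Table 1.
cells = [[ai * bj for bj in b] for ai in a]

# Marginal totals and grand total N.
row_totals = [sum(row) for row in cells]          # a_i * Sb
col_totals = [sum(col) for col in zip(*cells)]    # b_j * Sa
N = sum(row_totals)

# S'aa': sum over all p(p - 1) ordered pairs of distinct a's; likewise S'bb'.
S_prime_aa = sum(x * y for x, y in permutations(a, 2))
S_prime_bb = sum(x * y for x, y in permutations(b, 2))
```

With these constants N = (Sa)(Sb) = 6 × 7 = 42, and S'aa' = (Sa)² − Sa² = 36 − 14 = 22.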


Problem (1): To estimate components of variance it is supposed that
each observation may be described by

$$y = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \epsilon$$

where the symbols have the usual connotations and variances as if for
infinite populations (Eisenhart's Model II. If populations of α, β are
finite, compare comments by Hendricks, 1951). Then the sums of
squares of the analysis of variance (calculated as in col. 3) have expectations
as in col. 4 of Table 2, if the numbers of observations in each sub-class
are independent of the means.

The coefficient of $V_{\alpha\beta}$ in the sub-total line is the well known formula
for one-way classification into pq classes with variable numbers (ab);
the coefficients of $V_\alpha$ and of $V_\beta$ are, as one would expect, of similar
form. The coefficients of $V_{\alpha\beta}$ in other terms may be expressed in
several ways, e.g. for the A classes:

$$\frac{(S'aa')(Sb^2)}{N} = \frac{N(Sb^2)}{(Sb)^2} - \frac{SS(ab)^2}{N} = Sa\left\{1 - \frac{Sa^2}{(Sa)^2}\right\}\frac{Sb^2}{Sb}$$


However, the form given in Table 2 seems best to show the structure
when it is remembered that S implies p or q terms according as it
operates on a or b, and S' implies p(p − 1) or q(q − 1) terms respectively.
These coefficients are proportional to the degrees of freedom (making
the respective coefficients in the mean squares equal) only if $a_1 = a_2 =
\cdots = a_p$ and $b_1 = \cdots = b_q$. Therefore to estimate $V_\alpha$ or $V_\beta$ it is
not sufficient simply to subtract the interaction mean square from the
respective class mean square. A linear function of three rows of the
table is required.

[TABLE 2. Analysis of variance: sums of squares (col. 3) and their expectations (col. 4). The table body is not recoverable from this copy.]
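The point may be seen on a small numerical case. The sketch below rests on stated assumptions: the coefficient of the interaction variance in the A sum of squares is taken from the text as (S'aa')(Sb²)/N, the B coefficient is written by symmetry, and the interaction coefficient (S'aa')(S'bb')/N is obtained here by subtracting the A and B coefficients from the sub-total coefficient N − SS(ab)²/N. That last step is a derivation made for this illustration, not a transcription of Table 2. The function name and the proportion constants are hypothetical.

```python
from fractions import Fraction

def interaction_variance_coefficients(a, b):
    """Coefficient of the interaction variance in each mean square.

    Returns exact fractions for the A, B and interaction (AB) mean squares,
    i.e. the sum-of-squares coefficient divided by its degrees of freedom.
    """
    p, q = len(a), len(b)
    Sa, Sb = sum(a), sum(b)
    Sa2, Sb2 = sum(x * x for x in a), sum(x * x for x in b)
    S_aa = Sa * Sa - Sa2          # S'aa'
    S_bb = Sb * Sb - Sb2          # S'bb'
    N = Sa * Sb
    return {
        "A": Fraction(S_aa * Sb2, N * (p - 1)),
        "B": Fraction(S_bb * Sa2, N * (q - 1)),
        "AB": Fraction(S_aa * S_bb, N * (p - 1) * (q - 1)),
    }

# Equal numbers: every coefficient is the same.
equal = interaction_variance_coefficients([1, 1, 1], [1, 1])

# Unequal a, constant b: the A and interaction coefficients still agree,
# but the B mean square carries a larger multiple of the interaction variance.
unequal_a = interaction_variance_coefficients([1, 2, 3], [1, 1])
```

With a = (1, 2, 3) and b = (1, 1) the A and interaction mean squares both carry 11/6 of the interaction variance, while the B mean square carries 7/3, so subtracting the interaction mean square from the B mean square would not isolate the B component.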

If one set of proportion factors is constant, say $a_1 = \cdots = a_p$,
then the coefficients of $V_{\alpha\beta}$ are equal in the mean square for interaction
and in the mean square between classes B which have unequal numbers.


Problem (2): Tests of significance: In most examples of unequal class 
frequencies described in the literature it has been assumed that the 
within-sub-class mean square gives the appropriate estimate of error 
variance. Then demonstration of a significant interaction implies effects 
of the main classes whatever their own mean squares may say. If 
interaction is not significant further tests are made on assumption that 
it is zero—which may or may not be justified. But it is just as likely, 
with unequal as with equal sub-class frequencies, that experimental 
error (or random variations to be averaged in evaluating treatment 
means) may be compounded of variation between as well as within 
sub-classes. (The appropriate error for a variety of cases is discussed 
by Snedecor, 1946, Sec. 11.8.) Since $Sa^2 > S'aa'/(p - 1)$ when a is
variable, the B class mean square will be increased relative to the
interaction mean square by a multiple of the interaction variance ($V_{\alpha\beta}$) as
well as by variance ascribable to class means ($V_\beta$), and conversely for
b and A classes.
To abbreviate the formulae, let 


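$\theta_1$ = E(Mean square between classes) = $k_1V_\alpha + k_2V_{\alpha\beta} + V_0$
$\theta_2$ = E(Mean square for interaction) = $k_3V_{\alpha\beta} + V_0$
$\theta_3$ = E(Within sub-class variance) = $V_0$
$k_2 > k_3$


Then the test of significance for existence of $V_\alpha$ implies testing the
hypothesis

$$\theta_1 - \frac{k_2}{k_3}\theta_2 + \left(\frac{k_2}{k_3} - 1\right)\theta_3 = 0$$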


which is discussed by Cochran (1951). 




Since this note was written (November, 1948, in course of work for 
the Rubber Research Institute of Malaya) Cochran and Cox (1950, 
Sec. 14.31) have noted the problem and given the same solution as 
above incidental to a special case, viz. when b is constant and may be 
put equal to unity so that $a_i$ = their $r_i$ = the number in each sub-class.

Work which brought this matter forcibly to my attention contains
other complexities which make it clumsy for illustration; so an example 
is taken from the literature. Snedecor (1934) in his Table II gives the 
analysis which is reproduced in Table 3 with addition of expectations 
of the mean squares. 


TABLE 3

                      d.f.    M.Sq.    E(M.Sq.)
Between years            2    21.9     $355.04V_y + 44.864V_{yp} + V_0$
Between pens             7    1.97     $145.65V_p + 57.027V_{yp} + V_0$
Interaction             14    1.11     $44.310V_{yp} + V_0$
Within sub-classes    1148    .567     $V_0$


From these we estimate
$V_{yp} = .012254$; $57.027V_{yp} + V_0 = 1.266$;


indicating that the interaction mean square under-estimates by 14 per
cent the variance appropriate to testing persistence of pen differences
over many years.
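The arithmetic behind these estimates can be retraced from the entries of Table 3. The sketch below uses only the tabled mean squares and coefficients; the variable names are illustrative. The last line, an estimate of the pen component, is an extra step not stated in the text.

```python
# Entries from Table 3.
ms_pens = 1.97          # mean square between pens
ms_interaction = 1.11   # interaction mean square
ms_within = 0.567       # within sub-class mean square
k3 = 44.310             # coefficient of V_yp in the interaction mean square
k2_pens = 57.027        # coefficient of V_yp in the between-pens mean square
k1_pens = 145.65        # coefficient of V_p in the between-pens mean square

# Within sub-class variance and interaction component.
V0 = ms_within
V_yp = (ms_interaction - V0) / k3

# Variance appropriate for testing pen differences over many years.
error_for_pens = k2_pens * V_yp + V0

# How much larger the appropriate error is than the interaction mean square.
ratio = error_for_pens / ms_interaction

# Extra step (not in the text): the implied estimate of the pen component.
V_p = (ms_pens - error_for_pens) / k1_pens
```

The ratio comes out close to 1.14, which is the sense in which the interaction mean square falls short by 14 per cent.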


REFERENCES 


Cochran, W. G. Biometrics 7: 17-32, 1951. 

Cochran, W. G. and Cox, Gertrude M. Experimental Designs, J. Wiley & Sons, Inc., 
New York, 1950. 

Eisenhart, C. Biometrics 3:1-21, 1947. 

Hendricks, W. Biometrics 7:97-101, 1951. 

Snedecor, G. W., J. Am. Stat. Assoc. 29: 389-393, 1934. 

Snedecor, G. W. Statistical Methods. Iowa State College Press, 2nd Ed, 1946. 




CONSISTENCY OF ESTIMATES OF VARIANCE COMPONENTS 


R. E. COMSTOCK AND H. F. ROBINSON*


Institute of Statistics, North Carolina State College 


ESTIMATION OF VARIANCE COMPONENTS has important applica- 
tions in genetic research, and geneticists were among the first to 
use analysis of variance extensively for that purpose. Early examples
of component estimation in genetics are contained in papers by Lush 
et al. (1934), Bywaters (1937), and Stonaker and Lush (1942). Numerous 
later papers could also be cited. 

The earlier component estimates in the genetic literature were not 
accompanied by measures of reliability or confidence limits of any sort.
The first attempts to place confidence limits were made by Knapp and
Nordskog (1946) in connection with estimation of the ratio of genetic to 
total variance in beef cattle. 

Research on the genetics of quantitative characters of economic
plants now in progress at the North Carolina Experiment Station relies 
heavily on variance component estimation. In designing our experi- 
ments, prediction of the sampling variance of component estimates has 
been predicated on normal distribution theory. Whether this is satis- 
factory depends on conformity of observed distributions of component 
estimates with assumptions of normality and homogeneity of variances 
throughout the experimental material. A preliminary investigation of 
the problem has been made using data collected for estimation of 
genetic variance components in corn. 


DESCRIPTION OF THE EXPERIMENT 


The experiment, described in greater detail by Robinson et al (1949), 
was composed of distinct units. The majority of the units were identical; 
it is from these that data for this study were taken. The plant material 


*Presented before the American Statistical Association annual meeting in Chicago, 1950 at 
sessions held jointly by the Biometrics Section of the American Statistical Association and The Bio- 
metric Society (ENAR). 




for a single unit was a set of 16 full-sib families of corn. The 16 families 
were in 4 groups of 4 each. In any one group, the staminate, i.e. the 
male, parent was the same for all families; but no two groups had the 
same male parent. Each family had a different pistillate (female) 
parent. A unit of the experiment was a two replicate, randomized 
block comparison of the 16 families of such a set. 

Complete data on several plant characters were obtained from 34 of 
these experimental units. The plant material belonged to three separate 
but similar populations. Within each of these populations the choice 
of parents for the families was conducted in a manner believed to be 
random. Because of the similarity of the populations and also because 
it makes no essential difference in findings, the data are treated in what 
follows as though all material were a random sample from a single 
population. 


TABLE 1
COMBINED ANALYSIS* OF DATA FROM ALL 34 UNITS

Variance Source                d.f.    M.S.     Exp. of m.s.
Units                            33
Replications in units            34
Males in units                  102    .0214    $\sigma^2 + 2\sigma_f^2 + 8\sigma_m^2$
Females in males in units       408    .0102    $\sigma^2 + 2\sigma_f^2$
Remainder in units              510    .0036    $\sigma^2$
Plants in plots                 748    .0021    $\sigma_w^2$

$\sigma^2 = \sigma_w^2/n + \sigma_p^2$ where
$\sigma_p^2$ = variance due to plot effects
$n$ = number of plants per plot


*Except for the last line of the table, computations were based on plot means.


Table 1 shows the sort of analysis made as the basis for estimation 
of variance components arising (a) from genetic differences among male 
parents, and (b) from genetic differences among female parents. This 
analysis combines the information on yield from all 34 units. It should 
be noted here that the variance of plants within plots was estimated 
from data collected in only about one-tenth the total number of plots. 


TREATMENT OF DATA AND DISCUSSION OF RESULTS 


Data on three characters were used. Correlations (each component
of variance considered separately) among these three characters were
low, indicating that independent information would be provided by the
three sets of data.


A variance analysis of the form shown in Table 2 was made


TABLE 2
ANALYSIS OF VARIANCE FOR A SINGLE UNIT OF THE EXPERIMENT

Variance Source      d.f.    M.S.     Exp. of m.s.
Replications            1
Males                   3    $M_m$    $\sigma^2 + 2\sigma_f^2 + 8\sigma_m^2$
Females in males       12    $M_f$    $\sigma^2 + 2\sigma_f^2$
Error                  15    $M_e$    $\sigma^2$

from the data of each unit of the experiment. Each of these analyses
provided an estimate of (1) $M_m$, the male mean square, (2) $M_f$, the
female mean square, (3) $M_e$, the residual mean square, (4) $\sigma_m^2$, the
variance component arising from genetic differences among males, and
(5) $\sigma_f^2$, the variance component from genetic differences among females.
In addition, $M_w$, the mean square for intra-plot plant variation, was
computed separately for all plots on which individual plant data had
been collected.


Variance of Estimates. When sampling is random and all contributing
effects are normally distributed, the variance of any mean square, M, is

$$\sigma_M^2 = \frac{2[E(M)]^2}{F} \qquad (1)$$

where E(M) symbolizes the expectation of the mean square and F, the
degrees of freedom on which the mean square is based. Then if a
component estimate is computed from two independent mean squares
as $(M_1 - M_2)/k$, its variance must be

$$\left[\frac{2[E(M_1)]^2}{F_1} + \frac{2[E(M_2)]^2}{F_2}\right] \Big/ k^2 \qquad (2)$$


Of concern is whether actual variances of the mean squares and com- 
ponent estimates were in accord with these formulae. (1) and (2) were 
used for estimation of the standard deviations of each of the six statistics, 
the average of all sample values of a mean square being substituted for 
its expectation. The exact computation for each is indicated in Table 3. 
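As an illustration of formulae (1) and (2), the sketch below applies them to the yield mean squares of Table 1 with the per-unit degrees of freedom of Table 2. The function names are hypothetical. The divisors k = 2 and k = 8 are read off the expected mean squares, and using the combined mean squares of Table 1 in place of the averages of the unit mean squares is an assumption made here.

```python
import math

def sd_of_mean_square(expected_ms, df):
    """Formula (1): standard deviation of a mean square on df degrees of freedom."""
    return expected_ms * math.sqrt(2.0 / df)

def sd_of_component(ms_large, df_large, ms_small, df_small, k):
    """Formula (2): standard deviation of (ms_large - ms_small) / k."""
    variance = (2.0 * ms_large ** 2 / df_large
                + 2.0 * ms_small ** 2 / df_small) / k ** 2
    return math.sqrt(variance)

# Yield mean squares from Table 1; degrees of freedom per unit from Table 2.
M_m, M_f, M_e = 0.0214, 0.0102, 0.0036
F_m, F_f, F_e = 3, 12, 15

sd_M_m = sd_of_mean_square(M_m, F_m)
sd_sigma_f2 = sd_of_component(M_f, F_f, M_e, F_e, k=2)   # (M_f - M_e) / 2
sd_sigma_m2 = sd_of_component(M_m, F_m, M_f, F_f, k=8)   # (M_m - M_f) / 8
```

The results, about .0175, .00218 and .00225, are close to the yield entries for s' in Table 4 if that table is read in units of 10⁻⁴; that reading of the units is an inference from the numbers, not something stated in the paper.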


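TABLE 3


ESTIMATION OF STANDARD DEVIATIONS (s') FROM
MEAN VALUES OF THE STATISTICS


$$s'_{M_w} = \bar{M}_w\sqrt{2/F_w} \qquad s'_{M_e} = \bar{M}_e\sqrt{2/15}$$

$$s'_{M_f} = \bar{M}_f\sqrt{2/12} \qquad s'_{M_m} = \bar{M}_m\sqrt{2/3}$$

$$s'_{\hat\sigma_f^2} = \frac{1}{2}\sqrt{\frac{2(\bar{M}_f)^2}{12} + \frac{2(\bar{M}_e)^2}{15}} \qquad s'_{\hat\sigma_m^2} = \frac{1}{8}\sqrt{\frac{2(\bar{M}_m)^2}{3} + \frac{2(\bar{M}_f)^2}{12}}$$

where $F_w$ denotes the degrees of freedom of $M_w$.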


The values obtained (symbolized as s’) are listed in Table 4, together 
with estimates computed from the observed variation in each series of 
statistics (symbolized as s). 


TABLE 4


ESTIMATED STANDARD DEVIATIONS OF MEAN SQUARES AND
VARIANCE COMPONENTS


                     Plant Height        Ear Diameter        Yield
Statistic            s         s'        s         s'        s         s'
$M_w$                185       164       18.1      7.0       11.4      9.2
$M_e$                370       294       23.9      11.4      15.2      13.1
$M_f$                1,475     1,329     34.6      28.8      50.8      41.8
$M_m$                9,275     10,223    112.2     112.7     163.3     175.1
$\hat\sigma_f^2$     764       681       17.0      15.5      23.0      21.9
$\hat\sigma_m^2$     1,148     1,288     14.1      14.6      19.5      22.5


The two estimates of the standard deviations are for the most
part in good agreement. The picture is much the same for all three
characters, s being larger than s' for $M_w$, $M_e$, $M_f$ and $\hat\sigma_f^2$; a little
smaller than s' for $M_m$ and $\hat\sigma_m^2$. However, the differences are all small
with the exception of those for $M_w$ and $M_e$ of ear diameter. The
explanation for these exceptions probably is that in collecting data the
operational definition of an ear was based on the presence of one or
more kernels of corn. This meant that an occasional measurement was
taken on an almost bare cob. The presence of one such measurement
among the ten taken per plot resulted in a great increase of the plant
variance for the plot concerned. The sporadic distribution among
plots of a few such ears would cause an increase in the standard deviation
of the within plot variances not reflected to the same degree in the
mean of those variances. Note, however, that this departure from
normal distribution of plant to plant variation appears not to have
caused much disturbance in the distribution of mean squares, $M_f$ and
$M_m$, which have in them considerable variance from sources other
than that among individual plants.


Distribution of Estimates. Table 5 shows the distribution of estimates 
of five of the statistics being discussed. The class intervals are in 


TABLE 5


DISTRIBUTIONS OF STATISTICS
CLASS CENTERS (STANDARD MEASURE)


Statistic            -2.5    -1.5    -.5     .5      1.5     2.5     3.5
$M_e$                1       12      49      22      12      5
$M_f$                        13      47      23      14      3       1
$\hat\sigma_f^2$             11      46      26      14      3
$M_m$                        5       60      21      9       6
$\hat\sigma_m^2$             12      50      24      10      5

Given: $E(M_1) = 2E(M_2)$, with $F_1$ and $F_2$ degrees of freedom for $M_1$ and $M_2$.

Then

$$\frac{\sigma_{M_1}^2}{\sigma_{M_1}^2 + \sigma_{M_2}^2} = \frac{4F_2}{4F_2 + F_1}$$

If $F_1 = F_2$, this ratio becomes .8; and if $F_1 < F_2$, the ratio is still larger.


standard measure based on the directly computed estimates of the
standard deviations (those symbolized by s in Table 4). Results for
the three characters are pooled in a single distribution for each mean
square or variance. This is appropriate in the case of $M_e$, $M_f$, and
$M_m$ since when these are expressed in standard measure their distributions
should vary only with degrees of freedom for the individual
estimate. Degrees of freedom varied for the three statistics but not
among characters. On the other hand, the justification for pooling
might be questioned for $\hat\sigma_f^2$ and $\hat\sigma_m^2$ since the ratios of the mean squares
involved in their estimation were not constant for all three characters
and therefore the distributions of these component estimates should
differ slightly between characters. However, since the distribution
differences to be expected would appear to be rather small in view of
the actual differences in mean square ratios the pooling was done.
Only by doing so are the numbers made sufficient to get a reasonable
idea of the distributions.

It will be noted that all five of the distributions are of the general
nature expected. Those for $M_e$, $M_f$, and $M_m$ should be of the chi-square
form with greatest skewness in the distribution of $M_m$ since here
the degrees of freedom for individual estimates are only three as compared
with 12 and 15, respectively, in the case of $M_f$ and $M_e$.

Note also that the distributions of $\hat\sigma_m^2$ and $\hat\sigma_f^2$ are very similar to
those for $M_m$ and $M_f$ respectively. This is to be anticipated since if
the larger of two mean squares involved in the estimation of a variance
component has an expectation as large or larger than double that of
the smaller one (as was true throughout our data) the variance of the
component estimate will be dominated by that of the larger mean
square, even if degrees of freedom for estimation of the two mean
squares are equal. If degrees of freedom are fewer for the larger mean
square it will contribute a still larger fraction of the variance of the
component estimate. These points are demonstrated at the bottom of
the table.
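The share of the variance contributed by the larger mean square follows directly from formula (1). A short sketch, with a hypothetical function name, for the case in which the larger expectation is a given multiple of the smaller:

```python
def share_from_larger(expectation_ratio, df_large, df_small):
    """Fraction of var(M_large - M_small) contributed by the larger mean square.

    expectation_ratio is E(M_large) / E(M_small); by formula (1) each
    variance is 2 [E(M)]^2 / F, and the common factor [E(M_small)]^2 cancels.
    """
    var_large = 2.0 * expectation_ratio ** 2 / df_large
    var_small = 2.0 / df_small
    return var_large / (var_large + var_small)

# Expectation of the larger mean square double that of the smaller one.
equal_df = share_from_larger(2.0, 12, 12)    # equal degrees of freedom
males_case = share_from_larger(2.0, 3, 12)   # 3 and 12 d.f., as for males and females
```

With equal degrees of freedom the share is .8; with 3 and 12 degrees of freedom it rises to 16/17, about .94.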


Fiducial Limits. Approximate fiducial limits were computed, in the
manner suggested by Bross (1950), for all estimates of both $\sigma_f^2$ and
$\sigma_m^2$. Results in terms of the number of times the limits failed to include
the mean of all estimates for the component and character in question 
are presented in Table 6. The number of such failures is 17 out of a 
total of 202 cases. Since it was 90% limits that were computed the 
number expected to fail to include the actual value of the parameter 
estimated is 20. The agreement seems entirely reasonable. 

The last two columns of Table 6 give results obtained with what
Bross referred to as normal approximation limits. These were set at
two standard deviations on either side of the estimate. Standard
deviations were estimated as indicated in Table 3, except that individual
sample values of mean squares were used where means of all
sample values are indicated in Table 3. Thus the practical situation,
in which estimate and limits are computed from the same sample data,


TABLE 6


FAILURES OF FIDUCIAL LIMITS ABOUT INDIVIDUAL ESTIMATES
TO INCLUDE THE MEAN OF ALL ESTIMATES


                                              Type of Limit
                                 Bross Fiducial               Normal Approx.
Character    Statistic           Upper limit   Lower limit   Upper limit   Lower limit
                                 too low       too high      too low       too high

Yield        $\hat\sigma_f^2$    3             1             4             0
             $\hat\sigma_m^2$    1             0

Ear Diam.    $\hat\sigma_f^2$    1             1             5             0
             $\hat\sigma_m^2$    2             1

Height       $\hat\sigma_f^2$    2             2             3             0
             $\hat\sigma_m^2$    2             1

Totals                           11            6             12            0


was simulated. Results are given only for $\hat\sigma_f^2$; it seemed pointless to
compute this type of limits for $\hat\sigma_m^2$ in view of the very small number of
degrees of freedom for the between males mean square. Here there are
12 failures to include the mean out of a total of 101 cases. This is considerably
more than 5%, the proportion expected to fall outside two standard
deviations of a normally distributed estimate. Moreover all the failures are
of the one type, those in which the upper limit set was smaller than the 
mean of all estimates. These results were, of course, to be expected since 
the component estimates were computed from mean squares based on 
few degrees of freedom. It should be noted that for this reason the 
individual component estimates are subject to large sampling errors. On 
the other hand with more degrees of freedom, greater symmetry of dis- 
tribution, as well as reduced sampling errors, of component estimates is 
obtained. Normal approximation limits then become much more satis- 
factory. In fact we are inclined to the view that when there is enough 
data so that component estimates are sufficiently precise for our pur- 
poses, the normal approximation limits are adequate. We suspect this 
will be true in many other fields of work. 
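The counts quoted in this section can be set beside their expectations with a few lines of arithmetic. The sketch below uses only the case and failure counts given in the text; the two-sided normal tail beyond two standard deviations is computed from the complementary error function, and the variable names are illustrative.

```python
import math

# Bross fiducial limits: 90% limits on 202 component estimates.
bross_cases, bross_failures = 202, 17
expected_bross_failures = 0.10 * bross_cases

# Normal approximation limits: two standard deviations either side, 101 cases.
normal_cases, normal_failures = 101, 12
observed_rate = normal_failures / normal_cases

# Probability that a normal variate falls more than two standard
# deviations from its mean (both tails together).
expected_outside_2sd = math.erfc(2.0 / math.sqrt(2.0))
```

The Bross limits fail 17 times against about 20 expected, while the normal approximation limits fail in about 12 per cent of cases against roughly 4.6 per cent expected under normality.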


SUMMARY 


The examination of these data revealed what seems to the authors 
to be satisfactory agreement with expectations assuming normal and 
random distribution of primary variables. The agreement is of three 
sorts. 




(1) The observed standard deviation of mean squares and variance 
component estimates agreed well with values derived from mean 
values of pertinent mean squares. 

(2) The forms of the distributions of estimates were of the sort expected.

(3) Approximate fiducial limits as suggested by Bross gave satis- 
factory results. 

It is quite obvious that these data cannot speak for other sorts of 
data. It is recognized, in addition, that the volume of data was not 
sufficient for the findings to be very critical. At the same time the 
general agreement with expectations is very encouraging to the authors 
and has increased our confidence in assumptions of normality and ran- 
domness as a practical working basis for design of our experiments. In 
this connection, however, it is proper to emphasize that results obtained 
from an experiment of the type described may suffer from shortcomings 
other than the random sampling variation examined. In particular the 
true value of a specific variance component may change in response to 
environment. It is quite possible that the underlying genetic variation 
in a specified population, while itself unchanged, would be the source 
of considerably more phenotypic or actual physical variation in one 
year or location than in another. Thus a given estimate may be very 
good for the situation in which it is obtained but incorrect for other 
situations. This implies that conclusions about the size of genetic 
variance components should be based on the average of estimates from 
experiments conducted in different locations and years. At the present 
time there can be no great certainty concerning optimum distribution 
of effort in time and space. This can come only with the accumulation 
of adequate data. However, the design of work for obtaining such data 
benefits from every increase in knowledge about the effects of random 
sampling in the individual experiment. 


REFERENCES 


Bross, Irwin. “Fiducial Intervals for Variance Components”, Biometrics 6:136-144,
1950.

Bywaters, J. H. “The Hereditary and Environmental Portions of the Variance in
Weaning Weights of Poland-China Pigs”, Genetics 22:457-468, 1937.

Knapp, Bradford, Jr. and Arne W. Nordskog. “Heritability of Growth and Efficiency
of Beef Cattle”, J. Anim. Sci. 5:62-70, 1946.

Lush, J. L., H. O. Hetzer, and C. C. Culbertson. “Factors Affecting Birth Weights of
Swine”, Genetics 19:329, 1934.

Robinson, H. F., R. E. Comstock, and P. H. Harvey. “Estimates of Heritability
and the Degree of Dominance in Corn”, Agron. Jour. 41:353-359, 1949.

Stonaker, H. H. and J. L. Lush. “Heritability of Conformation in Poland-China
Swine as Evaluated by Scoring”, J. Anim. Sci. 1:99-105, 1942.




THE USE OF COMPONENTS OF VARIANCE IN PREPARING 
SCHEDULES FOR SAMPLING OF BALED WOOL 


J. M. CAMERON*


Statistical Engineering Laboratory, 
National Bureau of Standards, Washington 25, D. C. 


INTRODUCTION 


RAW WOOL contains varying amounts of grease, dirt and foreign material
which must be removed before manufacturing begins. The
purchase price and customs levy of a shipment are based on the actual 
amount of wool present, i.e. on the amount of wool present after 
thorough cleaning—the ‘‘clean content’’. The clean content is expressed 
as the percentage the weight of the clean wool is of the original weight 
of the raw wool [8]. 

Until recently, estimates of the clean content of a shipment of wool 
were made by visual and manual inspection by experienced wool handlers.
The accuracy and precision of such estimates vary from individual
to individual since they are dependent upon the skill and
experience of the individual observers. Recently an electrical core
boring machine [8] has come into use for taking cores of about 1/4 
pound from a bale. With this technique it is possible to composite 
several cores from each of a number of bales to obtain a sample which 
is then subject to laboratory analysis. The accuracy of the method has 
been checked by comparing sample values with scouring mill results 
on the lots from which the samples were drawn. It is generally accepted 
that the method is accurate [9]. 

A shipment or lot of wool is not a homogeneous aggregate. Values 
of clean content from different bales exhibit considerable variability 
especially if the lot is made up from the output of several small pro- 
ducers or if no processing or grading has been done prior to packaging. 
Results from cores taken from the same bale similarly display diversity. 
Less uniform values result from cores from wools which are not processed
prior to packaging or which are made up by compositing several
producers' outputs prior to packaging.

In order to evaluate a particular sampling scheme or construct a
sampling schedule, measures of the magnitude of the between bale and
within bale variability must be known. The wool industry is at present


*Presented before the American Statistical Association annual meeting in Chicago, 1950 at 
sessions held jointly by the Biometrics Section of the American Statistical Association and The Bio- 
metric Society (ENAR). 




BIOMETRICS, MARCH 1951 


TABLE I

PERCENT CLEAN CONTENT OF k = 4 CORES FROM EACH OF n = 7 BALES OF URUGUAYAN
WOOL, WITH THE ESTIMATES OF BETWEEN AND WITHIN BALE COMPONENTS OF VARIANCE.
(DATA FROM CUSTOMS LABORATORY, BOSTON, THROUGH THE PERMISSION OF MR. LOUIS
TANNER).

            Bale 1   Bale 2   Bale 3   Bale 4   Bale 5   Bale 6   Bale 7

Core 1    |  52.33    56.99    54.64    54.90    59.89    57.76    60.27
Core 2    |  56.26    58.69    57.48    60.08    57.76    59.68    60.30
Core 3    |  62.86    58.20    59.29    58.72    60.26    59.58    61.09
Core 4    |  50.46    57.35    57.51    55.61    57.53    58.08    61.45

Bale Ave. |  55.48    57.81    57.23    57.33    58.86    58.78    60.78


1. Analysis of variance

                  Degrees of         Mean            Expected value of
                  freedom            square          mean square
Between Bales     n - 1 = 6          B = 10.9938     $\sigma_w^2 + k\sigma_b^2 = \sigma_w^2 + 4\sigma_b^2$
Within Bales      n(k - 1) = 21      W = 6.2606      $\sigma_w^2$


2. Estimates of components of variance

Estimate of within bale component: $\hat\sigma_w^2 = W = 6.261$
Estimate of between bale component:

$$\hat\sigma_b^2 = \frac{B - W}{k} = \frac{10.9938 - 6.2606}{4} = 1.183$$


3. Theoretical variance of the estimates

$$V(\hat\sigma_w^2) = \frac{2\sigma_w^4}{n(k - 1)}$$

$$V(\hat\sigma_b^2) = \frac{2}{k^2}\left\{\frac{(\sigma_w^2 + k\sigma_b^2)^2}{n - 1} + \frac{\sigma_w^4}{n(k - 1)}\right\}$$


4. Estimates of the variance of the estimates [2]

$$\text{Est. } V(\hat\sigma_w^2) = \frac{2W^2}{n(k - 1) + 2} = \frac{2(6.261)^2}{23} = 3.408$$

$$\text{Est. } V(\hat\sigma_b^2) = \frac{2}{k^2}\left\{\frac{W^2}{n(k - 1) + 2} + \frac{B^2}{n + 1}\right\} = \frac{2}{16}\left\{\frac{(6.261)^2}{23} + \frac{(10.994)^2}{8}\right\} = 2.102$$




SAMPLING OF BALED WOOL 85 


collecting data on these components of variance for wool of different 
types and origins. 

This paper discusses methods of estimating components of variance 
and shows how a sampling schedule with given characteristics can be 
worked out if values for the components of variance are known. 


METHODS OF ESTIMATING COMPONENTS OF VARIANCE. 


1. Analysis of variance technique. The most straightforward technique
for estimating the between and within components of variance is
the analysis of variance of a sample of k cores from each of n bales 
selected at random from a lot, the n bales being regarded as a random 
sample from the population of wool being studied. An example of such 
a set of data is shown in Table I along with the method of estimating 
the between bale and within bale components of variance. For refer- 
ence the usual formulas for the variance of the estimates are reproduced. 
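
The computation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original paper: the function name `variance_components` and the dictionary keys are invented here, the data are the 28 core values of Table I, and the estimated variances of the estimates use the divisors (degrees of freedom + 2) quoted from [2].

```python
# Table I data: one row per core, one column per bale (percent clean content).
CORES = [
    [52.33, 56.99, 54.64, 54.90, 59.89, 57.76, 60.27],
    [56.26, 58.69, 57.48, 60.08, 57.76, 59.68, 60.30],
    [62.86, 58.20, 59.29, 58.72, 60.26, 59.58, 61.09],
    [50.46, 57.35, 57.51, 55.61, 57.53, 58.08, 61.45],
]


def variance_components(cores):
    """One-way analysis of variance for k cores from each of n bales."""
    k = len(cores)        # cores per bale
    n = len(cores[0])     # number of bales
    bales = [[row[j] for row in cores] for j in range(n)]
    bale_means = [sum(b) / k for b in bales]
    grand_mean = sum(bale_means) / n

    # Mean squares between bales (B) and within bales (W).
    B = k * sum((m - grand_mean) ** 2 for m in bale_means) / (n - 1)
    W = sum((x - m) ** 2 for b, m in zip(bales, bale_means) for x in b) / (n * (k - 1))

    return {
        "B": B,
        "W": W,
        "var_within": W,              # estimate of the within bale component
        "var_between": (B - W) / k,   # estimate of the between bale component
        # Estimated variances of the two estimates (divisors are d.f. + 2).
        "est_v_within": 2 * W ** 2 / (n * (k - 1) + 2),
        "est_v_between": (2 / k ** 2) * (B ** 2 / (n + 1) + W ** 2 / (n * (k - 1) + 2)),
    }


result = variance_components(CORES)
for name, value in result.items():
    print(f"{name:14s} {value:8.4f}")
```

Note that `var_between` can come out negative when B is smaller than W; the sketch makes no attempt to truncate such a value at zero.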

Mean square within bales (W), as evaluated in these computations, 
includes both variation of the material and testing error arising from 
the operations of scouring, removing burrs, etc. which require con- 
siderable care with small amounts of wool (about 1/4 pound) such as 
are available from individual cores. So long as cores are to be tested
individually, separate evaluation of these two components is not required.
But when considering the variance to be expected from composite
samples the contribution of each should be separately evaluated.
It is probable that testing error is lower with large than with small samples,
but it is not to be expected that its variance will be reduced inversely
as the number of cores, as will that of the material component. For the
purpose of presenting an example of the use of variance components we
proceed tentatively on the assumption that testing error is negligible,
but conclusions with respect to composite samples may need revision
if testing error is later found to contribute an appreciable part of W.

2. Design of experiments to estimate components of variance. If the 
number of observations to be taken in an experiment is fixed at A = nk, 
and if conditions are such that every observation will be on the same 
sort of sample (here a ‘core’, i.e. ignoring other conformations possible 
with composite samples), Hammersley [5] has shown that the estimate 
of the between bales variance component will have maximum precision 
when the number, k, of units per group (cores per bale) is constant and 


$$k = \frac{A\sigma_b^2 + (A + 1)\sigma_w^2}{A\sigma_b^2 + 2\sigma_w^2} \qquad (k \ge 2).$$


For example if the number of determinations is fixed at A = 60 and
$\sigma_b^2 = \sigma_w^2 = 1$, then the optimum solution would be k = 2, n = 30 or two
cores from each of 30 bales; for $\sigma_b^2 = 1$, $\sigma_w^2 = 3$, it turns out that k = 4
and n = 15 is the optimum allocation.
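
As a rough check on the two allocations quoted above, the sketch below evaluates the ratio (A·σ_b² + (A + 1)·σ_w²)/(A·σ_b² + 2·σ_w²) and, separately, searches the integer values of k that divide A for the smallest variance of the between bale estimate, using the usual analysis of variance formula with n = A/k. The function names are illustrative and restricting k to divisors of A is an assumption made here so that n is a whole number.

```python
def optimum_k_continuous(A, var_b, var_w):
    """Cores per bale from the closed-form ratio (generally not an integer)."""
    return (A * var_b + (A + 1) * var_w) / (A * var_b + 2 * var_w)


def variance_of_between_estimate(A, k, var_b, var_w):
    """V of the between bale estimate for n = A/k bales with k cores each."""
    n = A / k
    return (2 / k ** 2) * ((var_w + k * var_b) ** 2 / (n - 1)
                           + var_w ** 2 / (n * (k - 1)))


def best_integer_k(A, var_b, var_w):
    """Integer k >= 2 dividing A that minimizes the variance above."""
    candidates = [k for k in range(2, A // 2 + 1) if A % k == 0]
    return min(candidates, key=lambda k: variance_of_between_estimate(A, k, var_b, var_w))


for var_b, var_w in [(1, 1), (1, 3)]:
    k = best_integer_k(60, var_b, var_w)
    print(var_b, var_w, round(optimum_k_continuous(60, var_b, var_w), 2), k, 60 // k)
```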

Critical values of the ratio $\sigma_b^2/\sigma_w^2$ for changing from (k + 1) units
per group to k units per group to obtain maximum precision in estimating
$\sigma_b^2$ can be computed.* Ranges for the ratio for values of k and
A are shown in Table II.


TABLE II

ENTRIES SHOW RANGES FOR THE RATIO $\sigma_b^2/\sigma_w^2$ FOR VARIOUS VALUES OF k TO OBTAIN
MAXIMUM PRECISION IN ESTIMATION OF $\sigma_b^2$, BY ANALYSIS OF VARIANCE TECHNIQUE,
FOR FIXED NUMBER A = nk SAMPLING UNITS.

   A        k = 2        k = 3            k = 4

   20      > .568     .278 - .568     .174 - .278
   40      > .638     .348 - .638     .231 - .348
   60      > .661     .368 - .661     .250 - .368
  100      > .679     .384 - .679     .266 - .384
   ∞       > .707     .408 - .707     .289 - .408
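
The boundaries between adjacent values of k can be computed directly. The sketch below is an assumption-laden illustration: the closed form used, (A·φ + 1)² = (A − k)(A − k − 1)/(k(k − 1)), was obtained here by equating the analysis of variance expression for the variance of the between bale estimate at k and at (k + 1) cores per bale with A = nk held fixed, and the function name `critical_ratio` is invented. Its output can be compared with the ranges tabulated in Table II.

```python
import math


def critical_ratio(A, k):
    """Ratio of between to within variance at which k and k + 1 cores tie.

    Above the returned value, k cores per bale give the smaller variance;
    below it, k + 1 cores do. A may be math.inf for the limiting case.
    """
    if math.isinf(A):
        return 1 / math.sqrt(k * (k - 1))
    return (math.sqrt((A - k) * (A - k - 1) / (k * (k - 1))) - 1) / A


for A in (20, 40, 60, 100, math.inf):
    print(A, [round(critical_ratio(A, k), 3) for k in (2, 3, 4)])
```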


In practice the clean content value is not determined for the indi- 
vidual cores because this is a costly and time consuming business as 
there are usually over 100 cores taken from a lot. Instead all the cores 
are composited and analyzed as one large sample. The clean content 
value determined from the composited sample will have the same pre- 
cision as if all cores had been tested individually (assuming negligible 
analytical error) but information on the between and within compon- 
ents of variance for the lot being sampled is lost. 

3. Composited samples: alternate lots sampled with different numbers of
cores per bale. Values from composited samples can reveal information
on the between and within bale components of variance if the number
of cores per bale differs from sample to sample ([4], p. 164). If from lots
of wool of the same type and from the same source n bales are drawn
with $k_1$ cores per bale in odd numbered lots and $k_2$ cores per bale in
even numbered lots, $k_1 < k_2$, then the variance of the lot averages will
be, for odd numbered lots

$$\sigma_1^2 = \sigma_m^2 + \frac{\sigma_b^2}{n} + \frac{\sigma_w^2}{nk_1}$$


*By equating expressions for $V(\hat\sigma_b^2)$ for k and (k + 1) cores per bale and A fixed, the critical ratio
$\phi = \sigma_b^2/\sigma_w^2$ is found to be given by

$$(A\phi + 1)^2 = \frac{A^2 - A(2k + 1) + k(k + 1)}{k(k - 1)}.$$




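and for even numbered lots

$$\sigma_2^2 = \sigma_m^2 + \frac{\sigma_b^2}{n} + \frac{\sigma_w^2}{nk_2}$$

where $\sigma_m^2$ is the variance component between lots.
If $S_1^2$ and $S_2^2$ are estimates of $\sigma_1^2$ and $\sigma_2^2$ we can estimate $\sigma_w^2$ by

$$\hat\sigma_w^2 = \frac{nk_1k_2}{k_2 - k_1}(S_1^2 - S_2^2)$$

and $n\sigma_m^2 + \sigma_b^2$ by

$$\frac{n}{k_2 - k_1}(k_2S_2^2 - k_1S_1^2).$$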


If $\sigma_m^2$ may be assumed negligible the latter may be used as an estimate
of $\sigma_b^2$. The assumption will not usually be justifiable, but it may still
be convenient to use the estimate, remembering that it may be biased
so as to lead to an over-estimate of the number of bales required to be
sampled.*

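If 2M lots are observed (i.e. M - 1 d.f. for estimating each of $S_1^2$ and $S_2^2$)
the variance of these estimates would be (assuming $\sigma_m^2 = 0$)

$$V(\hat\sigma_w^2) = \frac{2}{M - 1}\left(\frac{nk_1k_2}{k_2 - k_1}\right)^2(\sigma_1^4 + \sigma_2^4)$$

and

$$V(\hat\sigma_b^2) = \frac{2}{M - 1}\left(\frac{n}{k_2 - k_1}\right)^2(k_1^2\sigma_1^4 + k_2^2\sigma_2^4)$$

which would in practice be evaluated by substituting $S_1^2$ for $\sigma_1^2$ and $S_2^2$
for $\sigma_2^2$.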

As might be expected, this method is not very precise—the number 
of lots needed for even a moderate degree of precision is quite large. 
For example, for 2 = 1 and o = 1, k, = land k, = 4,n = 100, 


572 


V(ée) = 


whence for V(é2) 
andfor V(é) 


1/4, M must equal 160, 
1/4, M must equal 30. 


*In a similar way all three components could theoretically be separated by observing three sets 
of lots with different values for both n and k. It seems likely however that such estimates would 
usually be too inaccurate to be useful and this extension has not been studied in detail. 


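For $\sigma_w^2 = 4$, $\sigma_b^2 = 4$, n = 100;

$$V(\hat\sigma_w^2) = \frac{316.3}{M - 1} \quad\text{and}\quad V(\hat\sigma_b^2) = \frac{103}{M - 1}$$

whence for $V(\hat\sigma_w^2) = 1$, M must equal 317,
and for $V(\hat\sigma_b^2) = 1$, M must equal 104.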

The variance of the estimates is smallest when $k_2 - k_1$ is large; the
best choice seems to be $k_1 = 1$ and $k_2$ as large as possible.
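
The dependence of precision on the two core counts can be explored numerically. The following sketch is illustrative only: it assumes negligible lot-to-lot variance, M − 1 degrees of freedom for each of the two observed variances of lot averages, and returns the numerators c in V = c/(M − 1); the function name is invented here.

```python
def composite_variance_numerators(var_w, var_b, n, k1, k2):
    """Numerators c_w, c_b such that V(estimate) = c / (M - 1).

    var_w, var_b: within and between bale components of variance
    n: bales per lot; k1 < k2: cores per bale in odd and even numbered lots
    """
    s1 = var_b / n + var_w / (n * k1)   # variance of a lot average, k1 cores per bale
    s2 = var_b / n + var_w / (n * k2)   # variance of a lot average, k2 cores per bale
    c_w = 2 * (n * k1 * k2 / (k2 - k1)) ** 2 * (s1 ** 2 + s2 ** 2)
    c_b = 2 * (n / (k2 - k1)) ** 2 * (k1 ** 2 * s1 ** 2 + k2 ** 2 * s2 ** 2)
    return c_w, c_b


# Both components equal to 4, n = 100 bales, 1 versus 4 cores per bale.
print(composite_variance_numerators(4, 4, 100, 1, 4))
# Widening the gap between the core counts (1 versus 8) lowers both numerators.
print(composite_variance_numerators(4, 4, 100, 1, 8))
```

Dividing a numerator by the variance that can be tolerated and adding 1 gives the approximate number M of lots of each kind.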


TABLE III

AVERAGE PERCENT CLEAN CONTENT OF COMPOSITED SAMPLES OF URUGUAYAN
WOOL. M = 7 "LOTS" WERE SAMPLED n = 1 BALE PER LOT AND $k_1$ = 1 CORE PER
BALE; 7 OTHER "LOTS" WERE SAMPLED 1 BALE PER LOT AND $k_2$ = 4 CORES PER BALE.
ESTIMATES OF THE BETWEEN BALE AND WITHIN BALE COMPONENTS OF VARIANCE
ARE SHOWN. (DATA FROM CUSTOMS LABORATORY, BOSTON, THROUGH PERMISSION
OF LOUIS TANNER).

Percent clean content of sample composited      Percent clean content of sample composited
from 1 core per bale from 1 bale per lot        from 4 cores per bale from 1 bale per lot

"Lot"   1     58.31                             "Lot"   2     55.48
        3     58.46                                     4     57.81
        5     54.71                                     6     57.23
        7     60.23                                     8     57.33
        9     61.52                                    10     58.86
       11     62.23                                    12     58.78
       13     60.88                                    14     60.78

Average       59.48                             Average       58.04

$S_1^2 = 6.577$                                 $S_2^2 = 2.749$


1. Estimate of components of variance

Estimate of $\sigma_w^2$:

$$\hat\sigma_w^2 = \frac{(1)(1)(4)}{4 - 1}(6.577 - 2.749) = 5.104$$

Estimate of $\sigma_b^2$:

$$\hat\sigma_b^2 = \frac{1}{4 - 1}\left[4(2.749) - 6.577\right] = 1.473$$

2. $$\text{Approx. } V(\hat\sigma_w^2) = \frac{2}{6}\left[\frac{(1)(4)}{3}\right]^2\left\{(6.577)^2 + (2.749)^2\right\} = 30.11$$

3. $$\text{Approx. } V(\hat\sigma_b^2) = \frac{2}{6}\left(\frac{1}{3}\right)^2\left\{(1)^2(6.577)^2 + (4)^2(2.749)^2\right\} = 6.08$$




To illustrate the use of this technique, the 7 bale averages shown in 
Table I are reproduced in Table III along with values for single cores 
from each of 7 other bales from the same lot of wool. The 7 bale 
averages based on 4 cores per bale (from Table I) will be treated as if 
they represent the averages for 7 lots of wool sampled one bale per lot 
and 4 cores per bale. Similarly the 7 bale values based on 1 core per 
bale will be treated as if they represent the averages for 7 lots sampled 
one bale per lot and 1 core per bale (no assumption about homogeneity 
of lots need be made in this instance, since both sets of 7 bales are a 
random sample from the same lot.) 

The methods of computing the estimates of the between bale component
of variance, $\sigma_b^2$, and the within bale component of variance, $\sigma_w^2$, are
shown in the table along with approximate values for the variance of
these estimates.

It should be noted that a negative value for the estimate of one of 
the components of variance is possible especially if the between variance 
is very much larger than the within variance. Further, if there is a 
significant lot to lot variation the between bale component will over- 
estimate the actual variation between bales within a lot. 
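
The arithmetic of this illustration can be reproduced with a short Python sketch. It is not from the original paper: the function name and argument order are invented, lot-to-lot variance is assumed to be zero, and the inputs are the two variances of "lot" averages quoted in Table III (6.577 for 1 core per bale and 2.749 for 4 cores per bale) rather than values recomputed from the listed data.

```python
def composite_estimates(s1_sq, s2_sq, n, k1, k2, M):
    """Variance components from lots sampled with k1 and with k2 cores per bale.

    s1_sq, s2_sq: observed variances of lot averages (k1 and k2 cores per bale)
    n: bales per lot; M: number of lots of each kind
    Returns (within, between, approx. V of within, approx. V of between).
    """
    var_w = n * k1 * k2 * (s1_sq - s2_sq) / (k2 - k1)
    var_b = n * (k2 * s2_sq - k1 * s1_sq) / (k2 - k1)
    v_w = (2 / (M - 1)) * (n * k1 * k2 / (k2 - k1)) ** 2 * (s1_sq ** 2 + s2_sq ** 2)
    v_b = (2 / (M - 1)) * (n / (k2 - k1)) ** 2 * (k1 ** 2 * s1_sq ** 2 + k2 ** 2 * s2_sq ** 2)
    return var_w, var_b, v_w, v_b


var_w, var_b, v_w, v_b = composite_estimates(6.577, 2.749, n=1, k1=1, k2=4, M=7)
print(round(var_w, 3), round(var_b, 3), round(v_w, 2), round(v_b, 2))
```

Either estimate can be negative for particular samples, since each is a difference of two observed variances.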

4. Duplicate composite samples from each lot. An alternate plan has 
been suggested by Tanner of the U. S. Customs Laboratory [6] to avoid 
the question of variation in clean content from lot-to-lot. 

In a series of 2M lots of the same wool type, samples from even 
numbered lots are handled differently from the odd numbered lots as 
follows: 

(a) From each odd numbered lot two cores are drawn from each of $n_i$
bales; $n_i$ cores, one from each bale, are combined to give two composite
samples. The difference, $d_i$, between these two samples will
have variance

$$\frac{2\sigma_w^2}{n_i}$$

and the expected value of

$$S_1^2 = \frac{\sum_{i=1}^{M} n_i d_i^2}{2M}$$

will be $\sigma_w^2$.

(b) From even numbered lots two cores are drawn from each of the
$2n_j$ bales of the lot to be sampled. The cores from $n_j$ of these bales
are composited as subsample I and the remainder as subsample II.
The difference, $d_j$, of the subsample values will have a variance of

$$\frac{2}{n_j}\left(\sigma_b^2 + \frac{\sigma_w^2}{2}\right)$$

and the expected value of

$$S_2^2 = \frac{\sum_{j=1}^{M} n_j d_j^2}{2M}$$

will be $\sigma_b^2 + \sigma_w^2/2$,

TABLE IV(a)

AVERAGE PERCENT CLEAN CONTENT OF COMPOSITED SAMPLES
FROM 20 LOTS OF GREASY AUSTRALIAN APPAREL WOOL.
From M = 10 odd numbered lots two cores are drawn from each of $n_i$ bales. All the first cores are
composited as subsample I. The second cores are composited as subsample II. The following data
were taken:

           No. of bales           Average clean content:
Lot        taken from lot
number     $n_i$            Subsample I    Subsample II    Diff. $d_i$

   1         60               57.86          57.17           .69
   3         65               62.73          62.90           .17
   5         50               66.07          66.80           .73
   7         70               48.89          48.66           .23
   9         60               56.72          56.39           .33
  11         60               63.22          62.80           .42
  13         65               64.80          65.31           .51
  15         55               58.39          58              .21
  17         60               58.18          58              .02
  19         65               64.17          63              .31

                                         $\sum n_i d_i^2 = 103.51$


1. Estimate of within bale component of variance:

$$\hat\sigma_w^2 = S_1^2 = \frac{103.51}{(2)(10)} = 5.176$$

2. $$\text{Est. } V(\hat\sigma_w^2) = \frac{2(5.176)^2}{10} = 5.357$$


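whence

$$\hat\sigma_w^2 = S_1^2, \qquad \hat\sigma_b^2 = S_2^2 - \frac{S_1^2}{2}.$$

The estimates will have variances

$$V(\hat\sigma_w^2) = \frac{2\sigma_w^4}{M}, \qquad V(\hat\sigma_b^2) = \frac{2}{M}\left\{\left(\sigma_b^2 + \frac{\sigma_w^2}{2}\right)^2 + \frac{\sigma_w^4}{4}\right\}$$

which can be approximated by substituting $S_1^2$ for $\sigma_w^2$ and $S_2^2$ for
$(\sigma_b^2 + \sigma_w^2/2)$. An example is shown in Table IV.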


TABLE IV(b)

From M = 10 even numbered lots two cores are drawn from each of $2n_j$ bales. The cores from each
of $n_j$ of these bales are composited as subsample I, the remainder composited as subsample II.
The following data were taken:

           No. of bales           Average clean content:
Lot        composited per
number     subsample $n_j$  Subsample I    Subsample II    Diff. $d_j$

   2         70               62.41          60.86          1.55
   4         65               60.47          59.52           .95
   6         65               59.02          60.35          1.33
   8         65               55.84          56.88          1.04
  10         55               59.51          60.09           .58
  12         65               60.08          59.38           .70
  14         55               48.06          47.38           .68
  16         60               61.73          60.98           .75
  18         65               58.38          59.09           .71
  20         60               59.90          58.81          1.09

Total       625                          $\sum n_j d_j^2 = 625.71$


1. Estimate of between bale component of variance

$$\hat\sigma_b^2 = \frac{625.71}{(2)(10)} - \frac{5.176}{2} = 28.70$$

2. $$\text{Est. } V(\hat\sigma_b^2) = \frac{2}{10}\left\{(31.29)^2 + \frac{(5.176)^2}{4}\right\} = 197.1$$


These data are used with the kind permission of Mr. Louis Tanner, U. S. Customs Laboratory,
Boston, Massachusetts. The data were presented at a meeting of Section D-3 of A.S.T.M. on 18
October, 1950.




If $\sigma_b^2 = \sigma_w^2 = 4$, $V(\hat\sigma_w^2) = 32/M$ and $V(\hat\sigma_b^2) = 80/M$.


In order that the s.d. of $\hat\sigma_b^2$ be 1, M must equal 80, or 160 lots must be
sampled in all.
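
The duplicate-composite computations can be sketched as follows. This is a hedged illustration rather than the author's procedure: the helper names are invented, the bale counts and differences are read from Table IV (the scanned table is damaged in places, so several differences, including .71 for lot 18 taken as the absolute difference of 58.38 and 59.09, are assumptions), and the pooled squares use the divisor 2M described in the text.

```python
def pooled_square(bales, diffs):
    """Sum of n * d^2 over lots, divided by 2M, for M pairs of subsamples."""
    return sum(n * d ** 2 for n, d in zip(bales, diffs)) / (2 * len(bales))


# Odd numbered lots: duplicate composites drawn from the same bales.
n_odd = [60, 65, 50, 70, 60, 60, 65, 55, 60, 65]
d_odd = [0.69, 0.17, 0.73, 0.23, 0.33, 0.42, 0.51, 0.21, 0.02, 0.31]
# Even numbered lots: the two composites come from different bales.
n_even = [70, 65, 65, 65, 55, 65, 55, 60, 65, 60]
d_even = [1.55, 0.95, 1.33, 1.04, 0.58, 0.70, 0.68, 0.75, 0.71, 1.09]

M = len(n_odd)
S1 = pooled_square(n_odd, d_odd)     # estimates the within bale component
S2 = pooled_square(n_even, d_even)   # estimates between + within / 2

var_within = S1
var_between = S2 - S1 / 2
est_v_within = 2 * S1 ** 2 / M
est_v_between = (2 / M) * (S2 ** 2 + S1 ** 2 / 4)
print(round(var_within, 3), round(var_between, 2),
      round(est_v_within, 3), round(est_v_between, 1))


def lots_for_unit_sd_of_between(var_w, var_b):
    """M (lots of each kind) making the variance of the between estimate 1."""
    return 2 * ((var_b + var_w / 2) ** 2 + var_w ** 2 / 4)


print(lots_for_unit_sd_of_between(4, 4))
```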

This method could easily be extended to cover cases where other 
numbers of cores per bale are taken. 

The wool industry has tabulated values for the between bale and 
within bale components of variance for the common types of wool and 
source or country of origin [1]. Data on these components of variance 
are constantly being obtained as a check on the persistence and ac- 
curacy of the tabulated values. 

Most of the data available on wool comes from composited samples 
and over the years values for a large number of lots will be available. 
The last two methods described above for composited samples may 
prove useful in analyzing such data. 


PRECISION OF ESTIMATION OF THE LOT AVERAGE 


Once sufficiently reliable values of $\sigma_b^2$ and $\sigma_w^2$ are obtained the problem
of just how many cores per bale and how many bales are to be
drawn from a lot can be tackled.

If $\sigma_b^2$ and $\sigma_w^2$ are known the precision of the average for a lot can be
computed for a prescribed sampling plan. For example if n bales are
selected from a lot of N bales and k cores are taken from each bale,
the variance of a lot average will be


$$\sigma_{\bar{x}}^2 = \frac{N - n}{N}\,\frac{\sigma_b^2}{n} + \frac{\sigma_w^2}{nk}. \qquad (1)$$

The number of cores that can be taken from a lot is not infinite but 
little is lost in assuming that this is so, since bales weigh from 200-1000 
pounds, the average core about 1/4 pound. However the number of 
bales in a lot is finite, hence the term (N - n)/N. Note that it is the
precision of a lot average that is sought not the precision with which 
the wool population average is estimated. 


SAMPLING PLANS WITH EQUAL PRECISION 


Consider two sampling plans, one with k cores from each of n bales,
the other with l cores from each of m bales. From equation (1) it can
be seen that they will have equal precision when

$$\sigma_b^2\left(\frac{1}{n} - \frac{1}{m}\right) = \sigma_w^2\left(\frac{1}{ml} - \frac{1}{nk}\right),$$

that is when

$$\frac{m}{n} = \frac{\phi + 1/l}{\phi + 1/k}$$

where $\phi = \sigma_b^2/\sigma_w^2$.
Table V shows some plans with equal precision. A choice can be
made according to costs of sampling.


TABLE V

SAMPLING PLANS OF EQUAL PRECISION.
The table shows, for three values of $\phi = \sigma_b^2/\sigma_w^2$, the ratio (m/n) of number of bales to be sampled for
equal precision with l cores per bale as compared to k cores per bale. (Ratios given for l > k; for
l < k use reciprocals.)

  k     l       φ = 1/2     φ = 1      φ = 2

  1     2         2/3        3/4        5/6
        3         5/9        2/3        7/9
        4         1/2        5/8        3/4
  2     3         5/6        8/9       14/15
        4         3/4        5/6        9/10
  3     4         9/10      15/16      27/28
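
Entries of this kind follow from the ratio m/n = (φ + 1/l)/(φ + 1/k), where φ is the ratio of the between bale to the within bale component of variance. A minimal sketch using exact fractions is given below; the function name `bale_ratio` is invented for the illustration.

```python
from fractions import Fraction


def bale_ratio(phi, k, l):
    """Ratio m/n of bales needed with l cores per bale versus k cores per bale."""
    phi = Fraction(phi)
    return (phi + Fraction(1, l)) / (phi + Fraction(1, k))


for k, l in [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]:
    row = [str(bale_ratio(phi, k, l)) for phi in (Fraction(1, 2), 1, 2)]
    print(k, l, row)
```

For example, with φ = 1 a plan taking 4 cores per bale needs 5/8 as many bales as a plan taking 1 core per bale to give the same precision.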


SELECTION OF A SAMPLING PLAN THAT PRODUCES 
REQUIRED PRECISION FOR MINIMUM COST 

Sampling conditions differ from warehouse to warehouse. Some- 
times it is possible to sample the lot (i.e. pick the bales to be core-bored 
and core bore them) as the bales are being moved into a warehouse; at 
other times the bales may already be stored in a warehouse and drawing 
a random sample will entail some moving of bales. If it is possible to
assign a cost $c_1$ of getting the bale and a cost $c_2$ of core-boring this bale,
then as a first approximation the cost, C, of sampling will be


$$C = n(c_1 + c_2k) \qquad (2)$$


where n is the number of bales and k the number of cores to be drawn
per bale.

It has been shown (e.g. [4] p. 162) that the value for k to achieve
minimum cost for a fixed precision in the estimate of the lot mean is
given when

$$k = \frac{\sigma_w}{\sigma_b}\sqrt{\frac{c_1}{c_2}}.$$

Note that only the ratios $\sigma_w/\sigma_b$ and $c_1/c_2$ need be known and even with
slight uncertainties in estimating these ratios, especially $c_1/c_2$, the
formula will generally be satisfactory for computing k.

The number of cores per bale must of course be an integer, whereas
this formula will usually give a fractional value. Let the nearest integer
below it be k, that above (k + 1). Then it may be deduced from equations
(1) and (2) that most information for least cost will be given
using (k + 1) cores when

$$\frac{c_1}{c_2}\,\frac{\sigma_w^2}{\sigma_b^2} > k(k + 1).$$

Otherwise use k cores. If the choice is between k and (k + r) cores
per bale use (k + r) cores when*

$$\frac{c_1}{c_2}\,\frac{\sigma_w^2}{\sigma_b^2} > k(k + r);$$

otherwise use k cores.
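
The integer rule can be checked against a direct comparison of costs. In the sketch below, which is an illustration with invented function names, the cost for a fixed precision is taken as proportional to (1 + v/k)(r + k), where r is the cost ratio (bale cost over core cost) and v is the ratio of the within bale to the between bale component; this follows from equations (1) and (2), because the number of bales needed is proportional to the between component plus the within component divided by k.

```python
import math


def cores_per_bale(cost_ratio, var_ratio):
    """Integer cores per bale by the k versus (k + 1) rule.

    cost_ratio: cost of getting a bale divided by cost of boring one core
    var_ratio:  within bale component divided by between bale component
    """
    product = cost_ratio * var_ratio
    k = max(1, math.floor(math.sqrt(product)))
    return k + 1 if product > k * (k + 1) else k


def relative_cost(k, cost_ratio, var_ratio):
    """Cost of reaching a fixed precision with k cores per bale (arbitrary units)."""
    return (1 + var_ratio / k) * (cost_ratio + k)


for cost_ratio in (0.5, 3, 4, 5, 7, 13):
    k_rule = cores_per_bale(cost_ratio, 1.0)
    k_best = min(range(1, 50), key=lambda k: relative_cost(k, cost_ratio, 1.0))
    print(cost_ratio, k_rule, k_best)
```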

Having thus selected the value for k, the number of cores to be
drawn per bale, a value of n can be determined so that the precision of
the value for the lot, $\bar{x}$, is at some predetermined level.

What degree of precision is needed is an economic problem. In
general it will be necessary to require that the 100(1 - 2p) percent
confidence interval about the mean for the lot be ±E where E is some
pre-assigned value such as 1 percent or 1/2 percent clean wool content.
It will be assumed that values for cores from a bale are normally distributed
about the value for the bale and that the values for the bales
in a lot are normally distributed about the lot mean. In which case
the 100(1 - 2p) percent confidence limits for the lot mean are given by

$$\bar{x} - E_p \quad\text{and}\quad \bar{x} + E_p$$

where $E_p = K_p\sigma_{\bar{x}}$ and $K_p$ is the normal deviate exceeded with probability p.
In order to find the value of n for which $E = E_p$ we must solve

$$E_p = K_p\sigma_{\bar{x}} = K_p\sqrt{\frac{(N - n)}{N}\,\frac{\sigma_b^2}{n} + \frac{\sigma_w^2}{nk}}.$$

This gives

$$n = \frac{\sigma_b^2 + \sigma_w^2/k}{E_p^2/K_p^2 + \sigma_b^2/N}$$

which when taken with k as determined above specifies the sampling
plan for lots of size N and 100(1 - 2p) percent confidence limits at
$\bar{x} \pm E_p$ (or for other limits $\bar{x} \pm E_\alpha$ and percent confidence level
100(1 - 2α) satisfying $E_\alpha K_p = E_p K_\alpha$).


*I am indebted to Dr. Churchill Eisenhart for the formula for choosing between k and (k + r) cores per bale.

To construct a sampling plan the following quantities must be known
(a) $\sigma_w^2$, the within bale variance
(b) $\sigma_b^2$, the between bale variance, or the value for the ratio $\sigma_b^2/\sigma_w^2$
(c) $c_1/c_2$, the relative cost of obtaining one bale to the cost of drawing
one core from the bale.
(d) $E_p$, "allowable" error in the average for the lot, with associated
confidence level, 100(1 - 2p) percent.
For example if


$$\sigma_w^2 = 4, \quad \sigma_b^2 = 4, \quad c_1/c_2 = 4, \quad N = 400,$$


and it is required to find the sampling plan which will evaluate the
average with 95 percent confidence limits at ±0.5 percent clean wool
content (i.e. $E_{.025} = 0.5$, $K_{.025} = 1.96$), the following computations are
made

$$k = \sqrt{(4)\left(\frac{4}{4}\right)} = 2$$

$$n = \frac{4 + 4/2}{\left(\dfrac{0.5}{1.96}\right)^2 + \dfrac{4}{400}} = 80$$

The required sampling plan would call for drawing 2 cores from each
of 80 bales from the lot of 400 bales.
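
The whole procedure, from the four inputs to a sampling plan, can be sketched as below. This is only an illustration under the assumptions of the text (normal distributions, negligible testing error, finite correction applied to the between bale term only); the function names are invented, and rounding the number of bales upward is a choice made here so that the required precision is met.

```python
import math


def lot_mean_variance(var_w, var_b, n, k, N):
    """Variance of the lot average for k cores from each of n bales out of N."""
    return (N - n) / N * var_b / n + var_w / (n * k)


def sampling_plan(var_w, var_b, cost_ratio, N, E, K):
    """Return (cores per bale, bales) for limits of plus or minus E.

    cost_ratio: cost of getting a bale divided by cost of boring one core
    K: normal deviate for the chosen confidence level (1.96 for 95 percent)
    """
    product = cost_ratio * var_w / var_b
    k = max(1, math.floor(math.sqrt(product)))
    if product > k * (k + 1):
        k += 1
    n = (var_b + var_w / k) / ((E / K) ** 2 + var_b / N)
    return k, math.ceil(n)


k, n = sampling_plan(var_w=4, var_b=4, cost_ratio=4, N=400, E=0.5, K=1.96)
half_width = 1.96 * math.sqrt(lot_mean_variance(4, 4, n, k, 400))
print(k, n, round(half_width, 3))
```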


CONSTRUCTION OF SAMPLING SCHEDULES 


The procedure for constructing a sampling schedule for baled wool 
involves the following steps: 
(a) Classification of wools by type, source, etc.,

(b) Obtaining values of the between bale and within bale components
of variance for each classification,

(c) Selection of appropriate values of the precision desired in the
estimation of the lot mean. This means a selection of the
confidence level to be attached to the value for precision, and

(d) Tabulation of lot sizes that are met with in practice.

With this information at hand, sampling plans that give the desired 
precision can be worked out for each lot size for the various combina- 
tions of between bale and within bale components of variation. If the 
number of bales to be taken is computed for different numbers of cores 
per bale, so that the user of the schedule has several plans to choose 




from, the table will be useful under a wide variety of sampling cost 
situations. The user must know the relative cost of drawing a bale 
from a lot to the cost of drawing a core from a bale to make the most 
economical solution. 

The above methods have been applied in the preparation of the 
ASTM sampling schedule for baled wool [1]. 


SUMMARY 


The preparation of a sampling schedule for packaged bulk material 
such as wool requires the determination of the between package and 
within package components of variance. These components of variance 
can be estimated by the analysis of variance technique if data from 
individual sampling units from packages are available. Often the 
individual sampling units are combined into a composite sample for 
laboratory analysis. The variance components for the individual samp- 
ling units may be obtained from such composite samples provided at 
least two different schemes of forming the composite samples are em- 
ployed. 

The construction of the most economical sampling plan with a de- 
sired precision can be achieved if the relative costs of drawing a package 
from a lot and drawing a sampling unit from a package are known. 


The author wishes to acknowledge with thanks the helpful sug- 
gestions made by H. Fairfield Smith. 


REFERENCES 


1. American Society for Testing Materials. "Tentative methods of core sampling of
wool in packages for determination of hard scoured wool content": ASTM
Designation: D1060-49T, 1949 Book of ASTM Standards, Part 5, 14-19.

2. Crump, S. L. "The estimation of variance components in analysis of variance",
Biometrics, 2, 7-11, 1946.

3. Deming, W. E. "On the sampling of physical materials", presented at the meeting
of the International Statistical Institute held in Bern, 5-10 Sept. 49.

4. Deming, W. E. Some theory of sampling, John Wiley and Sons, 1950.

5. Hammersley, J. M. "The unbiased estimate and standard error of the interclass
variance", Metron, 15, 189-205, 1949.

6. Tanner, Louis. Private communication dated 28 August 1950.

7. Tanner, Louis and Deming, W. E. "Some problems in the sampling of bulk
materials", ASTM Proceedings, 49, 1949.

8. U. S. Department of Agriculture. "Core sample analysis for determining shrinkage
of grease wool", U.S.D.A. Production and Marketing Administration, Livestock
Branch. March 1949.

9. U. S. Department of Agriculture. "Comparison of core tests and visual estimates
of shrinkage with actual mill scouring results on 96 lots of wool," Production and
Marketing Administration, Livestock Branch. 3 March 1949.




VARIANCE COMPONENTS AS A TOOL FOR THE 
ANALYSIS OF SAMPLE DATA 


WALTER A. HENDRICKS*


Bureau of Agricultural Economics,
U. S. Department of Agriculture


THE DESIGN OF A SAMPLE usually involves some method of taking
account of heterogeneity in the universe to reduce the size of the 
random sampling error. The degree of success with which this objective 
is attained depends upon the extent and precision with which various 
contributions to the variability of the individual units in the universe 
are isolated and measured before the sample is designed. The amount 
of pertinent information about the universe that is available before a 
sample is selected is sometimes meager and sometimes quite far-reaching; 
it may have become available from previous sampling investigations or 
from previous contacts with every individual in the universe. But 
regardless of how much information is available, or how it was ob- 
tained, an experienced sampler takes advantage of it. 

Mathematically, the heterogeneity of the individuals in a universe
with respect to a variable X can be expressed as a series of sums of
squares which add up to the total sum of squares $S(X - m)^2$, where
m is the average of all values of X in the universe. The number of
component sums of squares into which such a total can be resolved is
limited only by the amount of available information concerning the
universe and the number of individuals in the universe. This simple
algebraic identity serves as a basis for constructing the formula for the
sampling variance of a mean, $\bar{x}$, computed from a sample with a specified
design.

The synthesis of such a formula can be accomplished in more than 


*Presented before the American Statistical Association annual meeting in Chicago, 1950 at 
sessions held jointly by the Biometrics Section of the American Statistical Association and The Bio- 
metric Society (ENAR). 




one way. A paper by Cochran¹ appears to be the pioneer publication
on the subject, at least in this country. For samples drawn from an 
infinite population the formulas of analysis of variance were already 
available. But those formulas are not applicable to sampling from 
finite populations unless the sampling units included in the sample are 
a small fraction of the total number of such units in the universe, in 
which case the finite nature of the universe can be ignored. All mean 
squares and pure variance components isolated in an analysis of variance 
apply to a hypothetical infinite supply of every kind of unit considered 
in the analysis.

At first glance, this would seem to be a serious objection to the 
application of such a procedure to data from a finite population. How- 
ever, it happens that, even though all variances of individual units of 
different kinds estimated from the data apply to hypothetical infinite 
supplies of such units, it is possible to use such variances to compute the 
sampling errors of averages for samples from a finite universe by intro- 
ducing a very simple adjustment. 

To illustrate, consider a simple sub-sampling problem. Assume that
n primary units and k secondary units per primary unit have been
selected at random from a finite universe consisting of N primary units
with K secondary units per primary unit. Let $\sigma_b^2$ and $\sigma^2$ represent the
pure variance components between primary units and within primary
units. These quantities apply to a hypothetical infinite supply of
primary units with a hypothetical infinite supply of secondary units
within each primary unit. If the sample of nk observations had been
drawn from such an infinite universe, the variance of the average from
the sample of nk observations would be $\sigma_b^2/n + \sigma^2/nk$. The correction
for the finite nature of the universe is made by simply substituting the
universe values, N and K for n and k in that expression and subtracting
the result. One thus obtains


$$\sigma_{\bar{x}}^2 = \sigma_b^2\left(\frac{1}{n} - \frac{1}{N}\right) + \sigma^2\left(\frac{1}{nk} - \frac{1}{NK}\right).$$


In this equation $\sigma_{\bar{x}}^2$ is the variance that applies to samples from the
finite universe of NK observations even though $\sigma_b^2$ and $\sigma^2$ have been
defined in terms of a hypothetical infinite universe. The derivation of
this kind of formula depends upon the concept that the finite universe
of NK observations is itself a sample from the hypothetical infinite
universe to which $\sigma_b^2$ and $\sigma^2$ apply.
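
The adjustment just described amounts to computing the infinite-universe variance and subtracting the same expression evaluated at the universe sizes. A minimal sketch follows; the function and argument names are invented, and the numbers in the usage lines (components of 78 and 878 with 15 primary units of 20 secondary units each, from a universe of 30 primary units of 40 each) are purely hypothetical sample and universe sizes chosen for illustration.

```python
def variance_of_sample_mean(var_between, var_within, n, k, N=None, K=None):
    """Variance of the mean of n primary units with k secondary units each.

    var_between, var_within: pure variance components defined for an
    infinite universe. If N and K are given, the finite universe of N
    primary units with K secondary units each is allowed for by
    subtracting the same expression with N and K in place of n and k.
    """
    infinite = var_between / n + var_within / (n * k)
    if N is None or K is None:
        return infinite
    return infinite - (var_between / N + var_within / (N * K))


print(variance_of_sample_mean(78, 878, n=15, k=20))              # infinite universe
print(variance_of_sample_mean(78, 878, n=15, k=20, N=30, K=40))  # finite universe
print(variance_of_sample_mean(78, 878, n=30, k=40, N=30, K=40))  # complete census
```

When the sample is the whole universe the variance is zero, as it should be.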


¹Cochran, W. G. The use of analysis of variance in enumeration by sampling. Amer. Statis. Assoc.
Jour. 34:492-510, 1939.




ANALYSIS OF SAMPLE DATA 99 


At first glance it might appear that using the concept of an infinite 
universe in this way introduces an unnecessary complication into the 
picture. It seems simpler to substitute for $\sigma_b^2$ and $\sigma^2$ some analogous
expressions representing merely the averages of the squares of the 
appropriate deviations in the finite universe for the N primary units 
and the NK individual secondary units. That approach is in fact the 
more widely used at the present time and it will yield identical numerical 
results. What then is the justification for the other procedure? First, 
it is common knowledge that the formulas applicable to samples from 
an infinite universe are usually simpler than those applicable to a finite 
universe. Furthermore, when individual variance components are stated 
in terms of an infinite universe, the adjustment needed in a formula to 
make it apply to a finite universe is also a simple matter. On the 
other hand, if we give up the mathematical model of the infinite uni- 
verse, the derivation of the corresponding formulas becomes a more 
tedious² chore and one that the author has found to be less readily
grasped by students, particularly those students who are already familiar 
with the formulas that apply to an infinite universe. For example, consider
a mean, $\bar{x}$, computed from a simple random sample of n that was
drawn from a finite universe of N having a mean, m. Most students
know that if the sample of n were drawn from an infinite universe with
mean $\mu$ we have,


$$\sigma_{\bar{x}}^2 = E(\bar{x} - \mu)^2 = \frac{\sigma^2}{n}.$$


To make the transition to the finite case, all that is needed is to
show that


$$E(\bar{x} - m)^2 = E(\bar{x} - \mu)^2 - E(m - \mu)^2$$

or

$$\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} - \frac{\sigma^2}{N}$$


in which $\sigma^2$ is defined in terms of an infinite universe of which the finite
universe of N observations is itself a sample.

A point worth keeping in mind is that the variance components in 
the infinite population can usually be estimated from simpler formulas 
than can the corresponding values based on the finite universe definition 
when only a sample of data from the finite universe is available for 
their estimation. With the infinite universe concept, those components 


²For example, see Deming, W. E. Some Theory of Sampling. John Wiley and Sons, New York,
1950.




can be estimated from the same kind of ordinary analysis of variance 
table, regardless of whether all or only a sample of observations is 
available for analysis. When this approach is used, we see that the 
mathematical theory of sampling from finite populations is in fact not 
a separate theory at all; it falls entirely within the framework of con- 
cepts and definitions established for the theory of sampling from infinite 
populations. Any idea that there is a sharp line of demarcation between 
the theory of sampling from finite populations and the theory of samp- 
ling from infinite populations is illusory. 

To a considerable degree, these arguments are academic. The prac- 
tical sampler is often forced to use fairly rough estimates, if not outright 
guesses, of the numerical values of the variance components entering 
into his formulas. Under these conditions, the fine distinctions between 
the two mathematical models are hardly worth worrying about. The 
kind of situation frequently encountered in practice may be illustrated 
by an example from a beef-cattle price sampling project conducted by 
the Bureau. Data from a sample of 15 commission houses in Nebraska 
for 5 months show the kind of situation the statistician is up against. 
During one week of each month all individual sales for each of the 15 
firms were tabulated. The pertinent data are the total weight of cattle 
in each lot sold and the dollars paid for the lot. The average price for 
all sales is computed by dividing the total dollars paid for all sales by 
the total live weight of all animals sold. The problem was to estimate 
the variance components representing variability in price between firms 
and between sales by the same firm. The purpose of such analyses was 
to measure the principal sources of sampling error in the estimated 
average price for any one month to determine: 

(a) The precision with which such estimates were being made
currently.

(b) Whether or not the allocation of effort with respect to number of 
firms included and number of individual sales tabulated for each 
firm was fairly near the optimum. 

(c) Whether or not the sample needed to be enlarged or reduced,
and, if so, whether changes should be made in the number of firms,
the number of sales tabulated per firm, or both.

As the average price is estimated as a ratio, the analysis of variance 
was set up in such a way that all variances were in terms of the per- 
centage variability of the prices. The average values of the variance 
components over the 5-month period were estimated as follows, 


Between firms: 78
Within firms: 878




In other words, on a per sale basis, the percentage standard deviation 
of prices seems to be about 9 percent between firms and about 30 percent 
within firms. 
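The step from variance components to percentage standard deviations is a square root, and the same two components give a rough sampling variance for a monthly average. The sketch below is illustrative only: it assumes the usual two-stage formula (between component divided by the number of firms, plus within component divided by the total number of sales), which the text does not state, it ignores the weighting of lots by live weight, and the figure of 20 sales per firm is hypothetical.

```python
import math

# Average variance components quoted in the text (percentage units squared).
BETWEEN_FIRMS = 78.0
WITHIN_FIRMS = 878.0


def percent_sd(component):
    """Percentage standard deviation implied by one variance component."""
    return math.sqrt(component)


def variance_of_mean(n_firms, sales_per_firm,
                     between=BETWEEN_FIRMS, within=WITHIN_FIRMS):
    """Two-stage sampling variance of an unweighted mean (assumed formula)."""
    return between / n_firms + within / (n_firms * sales_per_firm)


sd_between = percent_sd(BETWEEN_FIRMS)   # about 9 percent
sd_within = percent_sd(WITHIN_FIRMS)     # about 30 percent

# Hypothetical allocation: 15 firms, 20 tabulated sales per firm.
se_percent = math.sqrt(variance_of_mean(15, 20))
print(round(sd_between, 1), round(sd_within, 1), round(se_percent, 2))
```

Changing the two arguments of `variance_of_mean` shows how the sampling error responds to adding firms as against adding sales per firm.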

As several hundred sales are tabulated each month, one would expect 
these variance components to measure differences between and within 
firms fairly accurately. But when they are estimated separately for 
each month they show considerable variation from month to month: 


Month          Between firms    Within firms

August               30             1244
September           227              968
October              74              887
November             62              847
December             26              420


These results will not surprise anyone who has had occasion to work
with variance components; they are quoted here to emphasize the fact
that in practical work an evaluation of the efficiency of a sample design
is usually only an approximation to the truth. But even so, the
statistician is better off than if he were working completely in the dark.

For arriving at an appraisal of the adequacy and efficiency of the
sample design, it was decided to use the average values, 78 between
firms and 878 within firms, throughout. There is obviously no justifiable
reason for assuming that the true values of those quantities are constant
from month to month, but it was also believed that it would be futile to
attempt any further refinement of the computations. The sampling
variability in the monthly estimates is so large that the use of any
estimates other than the over-all averages could hardly be regarded as an
improvement. Conclusions from the analysis were to apply to future work,
and even if it were known that the true values of the variance components
are not constant, nothing much could be done about it unless those changes
were predictable. We do not have sufficient data at hand here to make
any valid predictions of that kind. Furthermore, conclusions based on
such an appraisal of the sample design do not require an extreme degree
of refinement. It is well known that a sample design does not have to
be exactly optimum; it is usually only when the design departs
considerably from the optimum that the loss in efficiency becomes seriously
large.



ESTIMATING PRECISION OF TEXTILE INSTRUMENTS 


JOHN C. WHITWELL*


Department of Chemical Engineering, Princeton University, and 
Textile Research Institute, Princeton, N. J. 


One of the most consistently difficult problems in the textile industry
is the measurement of moisture for the assessment of value of 
material bought or sold on a dry basis. Complete solution of this 
problem involves a more complete knowledge of the relation of the 
water and textile fiber than is available. Complications exist in the 
many forms of water known to be associated with natural and synthetic 
fibers. Adsorbed, absorbed, and chemically bound portions contribute 
differently to different methods of estimation of total moisture content. 

The method of assessing moisture which is the oldest and best 
known is by loss-in-weight during drying, here designated as the oven
method. This method is not only time-consuming but is also unsatis- 
factory in many instances, and particularly with fabrics because of the 
necessity of cutting samples. Another is measurement of some electrical 
property, such as resistance, impedance, capacitance, or dielectric con- 
stant, which shows an appreciable change with change in moisture 
content of the sample. The oven has more apparent validity in its 
direct measurement of amount of water volatilized, but it is only an 
arbitrary standard. Calibration of another method against oven moist- 
ure should, therefore, provide an alternate arbitrary standard which 
could be considered an equally good measurement. 

The problem of measurement of a quantity whose exact relation to 
a desired property is not clearly defined may seem, at first glance, to 
be unusual. However, further consideration will undoubtedly provide, 
in the mind of any experimentalist, many similar instances. This phase 
of the problem will therefore be dismissed without additional discussion. 

Calibrations of several types of electrical meters intended for use in 
measurement of moisture in textiles have previously been discussed 
(3, 4, 6, 7). Variables which produce a major change in the dependence 
of meter readings upon the moisture, as measured by the oven method, 
have been emphasized, particularly in the second and fourth papers of 


*Presented before the American Statistical Association annual meeting in Chicago, 1950 at sessions 
held jointly by the Biometrics Section of the American Statistical Association and the Biometric 
Society (ENAR). 




the series. However, for routine control in mass production, the cali- 
bration is required to deal only with one type of material (e.g., one 
fabric, or one sort of raw stock) and only under specific conditions which 
should be sufficiently constant so that a single arbitrary scale, estab- 
lished for this situation, would be suitable. It is then important to de- 
termine the stability of the measurement of moisture if material of the 
same moisture content were repeatedly provided for analysis. This 
problem of precision in moisture measurement either by oven or by 
instruments was the major task. 

Methods of treating the data include both standard analysis of 
variance and separation of variance techniques (1, 2). The study of the
latter is secondary to the main problem, but is of interest since it ap- 
peared to be the only recourse in analysis of data from earlier experi- 
ments (3, 6). Comparison of separation of variance and analysis of 
variance in this instance, where both are possible, should provide in- 
formation as to the adequacy of the conclusions drawn from the earlier 
work. 

The close of this paper contains some data on the stress-strain 
moduli of yarns which have also been studied with the intention of 
providing an estimate of the relative precision of measurement inherent 
in two different types of measuring instruments. 


ELECTRICAL MOISTURE INSTRUMENTS 


Two instruments were used in the present work, both of which 
measure d.c. resistance, and both of which were models of the Hart 
Moisture Meter, one designed for wool and one for cotton. The method 
of operation has been described (3, 7). 


SAMPLES 


Three fabrics were tested, the nature of the material involved governing
the choice of instrument. Two of the fabrics were woven from
virgin wools, one being a standard worsted gabardine, dyed brown, and 
the other a gabardine with worsted warps and woolen filling, dyed black; 
they weighed 6.5 and 8 ounces per square yard respectively. The other 
fabric was an ordinary muslin (cotton), 5 ounces per square yard. 

Samples for tests were conditioned in a room where temperature and 
humidity were closely controlled until moisture equilibrium was so 
nearly attained that no further change would occur during test. The 
oven samples were bottled, weighed, dried, and reweighed; moisture con- 
tent was calculated as loss-in-weight divided by original weight and the 
result was reported as a percentage. Instrument samples were tested 
as soon as oven samples were bottled.




Sampling procedure. Meter measurements were made at predeter- 
mined positions in the fabric samples as shown in Figure 1. There are 
sixteen oven test positions and twenty-five instrument test positions. 


[FIGURE I. SAMPLE PATTERN FOR FABRIC USED IN OVEN AND METER TESTS.
Meter samples are designated by number (1 to 25, in five rows of five);
oven samples are designated by letter. Axes: warp and fill.]


No complication from this pattern arose in analysis of variance, but
for separation of variance components it was necessary to pair adjacent
meter and oven sub-samples. This pairing was accomplished by the
discarding of one row of meter readings (along the filling) and one
column of meter readings (along the warp), leaving sixteen meter
readings to correspond to the sixteen oven sub-samples. The method
of selecting the row and column to be removed was by the use of random
numbers. Studies were made of the effect of removing different rows
or columns and it was concluded that the results were not generally
sensitive to the particular group of sub-samples removed.
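The random discarding of one row and one column can be sketched as follows. This is only an illustration: the row-by-row numbering of the twenty-five meter positions follows Figure 1, but the function name and the seed are hypothetical, and the way the sixteen remaining meter positions are matched to individual oven sub-samples is not reproduced here.

```python
import random


def paired_meter_positions(rng=None, size=5):
    """Drop one random row and one random column from the meter grid.

    Positions are numbered 1..size*size row by row, as in Figure 1.
    Returns (dropped_row, dropped_col, remaining_positions), with the
    row and column given as zero-based indices.
    """
    rng = rng or random.Random()
    dropped_row = rng.randrange(size)
    dropped_col = rng.randrange(size)
    remaining = [
        r * size + c + 1
        for r in range(size)
        for c in range(size)
        if r != dropped_row and c != dropped_col
    ]
    return dropped_row, dropped_col, remaining


# A fixed seed (hypothetical) makes the selection repeatable.
row, col, kept = paired_meter_positions(random.Random(1951))
print(f"dropped row {row + 1}, column {col + 1}; kept {len(kept)} readings")
```

Calling the function with different seeds gives a simple way to repeat the sensitivity study mentioned above, in which different rows or columns were removed.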




TABLE I
DATA ON HART WITH FABRICS

Material      Date       Hart Av.   Oven Av.   No. Oven    Temp., °F   b used
                                               Samples

Brown         5-6-49       1.28       5.64        16         80.4       1.00
Gabardine     4-27-49      1.31       4.97        16         80.4       1.00
              4-7-49       6.11      10.30        16         50.4       1.00
              3-16-49      2.39       7.68        16         50.6       1.00

Black         4-6-49       4.67      10.09        16         50.3       1.00
Gabardine     3-15-49      2.57       8.29        16         50.0       1.00
              1-24-49      5.16       7.79        16         89.0       1.00
              1-12-49      2.42       5.63        16         91.5       1.00
              12-30-48     7.58      11.62        16         70.6       1.00
              9-9-48       7.05      11.37        16         72.7       1.00
              8-16-48      6.08      10.15        16         70.0       1.00
              8-23-48      3.91       8.12        16         72.7       1.00

Muslin        1-13-49      4.19       3.03        16         90.7       1.46
              1-25-49      6.89       5.02        16         89.0       1.46
              3-14-49      4.43       5.34        16         50.4       1.46
              4-5-49       7.06       6.96        16         51.0       1.46


TREATMENT OF DATA 


Oven data are assumed to follow the mathematical form


$X_{ij} = \mu + \eta_i + \epsilon_{ij}$    (1)


The symbol $X_{ij}$ is used exclusively to indicate one moisture content as
obtained from one oven analysis; $\mu$ is the general effect, $\eta_i$ is the variable
contribution of the sample and $\epsilon_{ij}$ that of the oven. On a comparative
basis the meter readings could be represented as


$Y_{ij} = \nu + \alpha_i + \epsilon_{Iij}$    (2)


where the symbols have the same relative meaning as in Equation (1).
If there is a known relation between X and Y, the mathematical
model of the Y-data could be rewritten


$Y_{ij} = \nu + \beta\eta_i + \epsilon_{Iij}$    (3)


where $\beta$ is the regression coefficient at the point $Y_{ij}$; b will be used as
an estimate of this coefficient from the data.


TABLE II
HART AND OVEN
SEPARATION OF VARIANCE

Date          n      s_s²       s_o²       s_I²      s_I²/b²    b²s_o²/s_I²

Muslin (b = 1.46)

1-13-49       16    0.0036     0.0194     zero       zero
1-25-49       16    0.0016     0.0149     0.0013     0.0006
3-14-49       16    0.0001     0.0155     0.0021     0.0010
4-5-49        16    0.0051     0.0100     zero       zero
Pooled (64 df.)                0.0150                0.0004        37.5

Black Gabardine (b = 1.00)

8-16-48       16   -0.0005     0.0065     0.0024
8-23-48       16    0.0038     0.0085     0.0046
9-9-48        16    0.0017     0.0098     0.0017
12-30-48      16    0.0003     0.0225     0.0035
1-12-49       16    0.0201     0.0029     0.0178
1-24-49       16   -0.0051     0.0222     0.0124
3-15-49       16    0.0002     0.0067     0.0051
4-6-49        16   -0.0016     0.0215     0.0142
Pooled (120 df.)               0.0126     0.0077                   1.64

Brown Gabardine (b = 1.00)

3-16-49       15   -0.0034     0.0310     0.0077
4-7-49        16   -0.0023     0.0301     0.0126
4-27-49       16    0.0015     0.0222     0.0006
5-6-49        16    0.0029     0.0126     zero
Pooled (59 df.)                0.0238     0.0052                   4.58


All Hart calibration lines were found to be linear and values of b
were estimated by least squares despite the known errors in both
variates. This procedure may be easily justified. If the muslin
experiment, for example, had been subjected to a joint analysis of variance
(5), the mean square for oven moisture between different moisture
contents would have been about 38.08, while the correction to be subtracted
for oven variation is seen from Tables II or III to be 0.013 to 0.015*,
which is clearly negligible.


*In a previous paper (3) instances were found with the Steinlite Moisture Tester where the 
calibration was curvilinear. By the same reasoning as used above, least square curves were fitted to 
the data and the individual values of b calculated for each group of points as the slope of the curve 
at the average oven moisture. 
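The size of the correction relative to the mean square can be made concrete with a short calculation. The sketch below simply forms the ratio of the two figures quoted in the text; reading the correction as a simple fraction of the mean square is an interpretation offered for illustration, not a restatement of the joint analysis in reference (5).

```python
# Figures quoted in the text for the muslin experiment.
MS_BETWEEN_MOISTURES = 38.08        # oven mean square between moisture contents
OVEN_ERROR_RANGE = (0.013, 0.015)   # oven error variance, Tables II and III


def error_share(error_variance, mean_square=MS_BETWEEN_MOISTURES):
    """Fraction of the between-moisture mean square due to oven error."""
    return error_variance / mean_square


shares = [error_share(v) for v in OVEN_ERROR_RANGE]
for v, share in zip(OVEN_ERROR_RANGE, shares):
    print(f"oven error {v:.3f}: {100 * share:.3f} percent of the mean square")
```

On these figures the oven error accounts for well under one-tenth of one percent of the mean square, which is the sense in which the correction is negligible.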


TABLE III
HART AND OVEN
ANALYSIS OF VARIANCE

Date         (WF)_O     df_O    df_I    (WF)_I/b²    b²(WF)_O/(WF)_I

Muslin

1-13-49      0.0093      9       16      0.0002
1-25-49      0.0108      9       16      0.0019
3-14-49      0.0182      9       16      0.0009
4-5-49       0.0151      9       16      0.0014
Pooled       0.0134                      0.0011         12.3**

Black Gabardine

8-16-48      0.0085      9       16      0.0009
8-23-48      0.0087      9       16      0.0115
9-9-48       0.0063      9       16      0.0033
12-30-48     0.0282      9       16      0.0026
1-12-49      0.0086      9       16      0.0058
1-24-49      0.0194      9       16      0.0050
3-15-49      0.0115      9       16      0.0034
4-6-49       0.0241      9       16      0.0052
Pooled       0.0144                      0.0047         3.06**

Brown Gabardine

3-16-49      0.0324      8       16      0.0047
4-7-49       0.0247      9       16      0.0071
4-27-49      0.0078      9       16      0.0066
5-6-49       0.0207      9       16      0.0061
Pooled       0.0214                      0.0061         3.51**


Some investigation was essential to indicate whether the variance 
estimates were dependent upon the magnitude of the moisture contents. 
There is no evidence that such dependence exists in the current data. 

The data may be treated essentially as outlined by Grubbs in 1948
(1) with the reservation implied in equation (3). This situation has
been treated by Toner et al. (3) and Smith (2), correcting for the
difference in measurement scales and consequent possibility of
departure of b from 1.00.

The separation of variance components is indicated in the following
manner:


$s_x^2 = s_s^2 + s_o^2$    (4)


$s_y^2 = b^2 s_s^2 + s_I^2$    (5)


where $s^2$ is an estimate of variance and the subscripts x, y, s, o, and I
refer to oven data, instrument data, sample, oven alone and instrument
alone respectively. Then if the sample variance is uncorrelated with
the oven or instrument variances, which seems reasonable,


$s_{xy} = b s_s^2$    (6)


where $s_{xy}$ is the estimate of covariance. Substitution of equation (6)
in (4) and (5) and rearrangement of terms will result in equations (7)
and (8), which follow Table IV.


TABLE IV

Sample No.    Dynamic Modulus (DM)     Static Modulus (SM)
              dynes/cm² × 10⁻¹⁰        dynes/cm² × 10⁻¹⁰

 1                  4.97                     3.76
 2                  4.85                     3.60
 3                  4.95                     3.55
 4                  4.97                     3.82
 5                  4.92                     3.69
 6                  5.09                     3.66
 7                  4.90                     3.68
 8                  4.92                     3.64
 9                  5.11                     3.81
10                  4.97                     3.72
11                  4.97                     3.77
12                  4.97                     3.74
13                  4.90                     3.74
14                  4.92                     3.77
15                  4.83                     3.69


$s_x^2 = 0.00566$;  $s_y^2 = 0.00575$
$s_{xy} = 0.00204$;  b = 0.75 (ratio of mean static to mean dynamic modulus)
$s_o^2 = 29.4 \times 10^{-4}$
$s_I^2 = 42.2 \times 10^{-4}$;  $s_I^2/b^2 = 42.2 \times 10^{-4}/(0.75)^2 = 75.1 \times 10^{-4}$
$s_s^2 = 27.2 \times 10^{-4}$


$s_o^2 = s_x^2 - s_{xy}/b$    (7)


$s_I^2 = s_y^2 - b s_{xy}$    (8)


It is apparent that the sampling procedure, producing matched pairs 
for oven and instrument, does not yield material directly applicable to 
these equations, which are based upon measurements on identical samples.
The term $s_s^2$ will include only major material variation, leaving
contributions of variation within pairs of fabric samples in the terms
$s_o^2$ and $s_I^2$. This condition could have been avoided by measuring oven
moisture content on the same sample as that on which an instrument 
reading had been obtained, but the procedure would have been im- 
practical experimentally. Due to difference in size of samples for each 
instrument and for the oven, the oven sample would have had to be 
cut out of the instrument sample, a procedure which was not feasible 
in this experiment. 
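The separation described by equations (4) to (8) can be sketched in a few lines and checked against the yarn data of Table IV, where x is the dynamic modulus and y the static modulus. The function names are hypothetical, and the value b = 0.75 is taken from the table rather than estimated; small differences from the printed figures arise because the table works from rounded intermediate values.

```python
from statistics import mean, variance

# Table IV: dynamic modulus (x) and static modulus (y) for 15 yarn samples.
DM = [4.97, 4.85, 4.95, 4.97, 4.92, 5.09, 4.90, 4.92,
      5.11, 4.97, 4.97, 4.97, 4.90, 4.92, 4.83]
SM = [3.76, 3.60, 3.55, 3.82, 3.69, 3.66, 3.68, 3.64,
      3.81, 3.72, 3.77, 3.74, 3.74, 3.77, 3.69]


def covariance(x, y):
    """Sample covariance with divisor n - 1."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def separate_variances(x, y, b):
    """Apply equations (6) to (8): return (s_s^2, s_o^2, s_I^2)."""
    s_xy = covariance(x, y)
    s_s2 = s_xy / b                    # equation (6) rearranged
    s_o2 = variance(x) - s_xy / b      # equation (7)
    s_i2 = variance(y) - b * s_xy      # equation (8)
    return s_s2, s_o2, s_i2


b = 0.75  # value used in Table IV
s_s2, s_o2, s_i2 = separate_variances(DM, SM, b)
print(f"s_s^2 = {s_s2:.5f}, s_o^2 = {s_o2:.5f}, s_I^2 = {s_i2:.5f}")
print(f"s_I^2 / b^2 = {s_i2 / b**2:.5f}")
```

Because equations (7) and (8) are differences of estimates, either component can come out negative in a small sample; Table II reports such cases as negative values or as zero.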


RESULTS 


The data and calculations for the Hart tests with fabrics are
summarized in Tables I, II, and III.

There is a temperature effect upon the meters, but the calibration 
lines at different temperatures are parallel; therefore the estimates of b 
reported in Table I were calculated from all the data regardless of 
temperature. The calculated coefficients for the black and brown 
gabardine were slightly above and below 1.00 respectively, but neither 
differed significantly from 1.00. The value of b reported for muslin does 
differ appreciably from 1.00 and its average value was therefore used. 

In Tables II and III, values of instrument variances are divided by
b² in order that a comparison of instrument and oven variances, both
in units of moisture content, might be obtained. Also, in the same 
tables, pooled values of the ratios of instrument and oven variances are 
provided in order to show the magnitude of the relation between the 
variances from the two sources and to provide a ratio of variances 
which could be tested for significance when such a test is applicable. 


DISCUSSION OF RESULTS 


Fabric Tests. The warp-fill (row-column) interaction mean squares, 
labelled (WF) in Table III and calculated by a simple double classifica- 
tion analysis of variance, were assumed to represent estimates of vari- 
ances of measuring methods, with most of the sample variation removed 
as warp and fill effects. The warp mean square exceeded the interaction 
(WF) mean square in 14 experiments out of 16. 




Despite the availability of this efficient estimate of errors, it was of 
interest to calculate the variance components of matched pairs since, as 
noted, this method had necessarily been employed in analysis of other 
instruments in tests performed earlier. 

The results of the calculations, as shown in Tables II and III, show 
no consistent effects which would indicate that the results for any one 
material might not suitably be pooled. 

There are, therefore, six variances available as follows, estimated
both directly and by separation of variance from matched pairs, all
values having been multiplied by 10⁴:


                     (WF)_I     s_I²     (WF)_O     s_o²

Muslin                 11         4        133       150
Black gabardine        47        77        144       126
Brown gabardine        61        52        214       238


In three of these comparisons the estimates from separation of variance 
are greater than those from analysis of variance but the opposite is the 
case for the other three. Apparently reasonable reliance can be placed 
upon such components when the better estimate is not available. 

The results indicate that the instruments are significantly better than 
ovens in yielding precise estimates of moisture. General figures for the 
improvement would appear to be 1.8 to 1 for the wool instrument and
3.5 to 1 for the cotton instrument, both in terms of standard deviation. 
It is improbable that the wool instrument offers as great an improve- 
ment as the cotton instrument. 
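The improvement figures quoted in terms of standard deviation are consistent with square roots of the pooled variance ratios in Table III. The sketch below makes that check; treating the quoted figures as square roots of those particular ratios is an assumption, since the text does not say how they were obtained.

```python
import math

# Pooled variance ratios b^2 (WF)_O / (WF)_I from Table III.
POOLED_RATIOS = {
    "Muslin (cotton instrument)": 12.3,
    "Black gabardine (wool instrument)": 3.06,
    "Brown gabardine (wool instrument)": 3.51,
}

# Ratio of oven to instrument standard deviation for each material.
sd_ratios = {name: math.sqrt(r) for name, r in POOLED_RATIOS.items()}
for name, ratio in sd_ratios.items():
    print(f"{name}: oven SD is about {ratio:.2f} times the instrument SD")
```

The two gabardine ratios bracket the figure of 1.8 quoted for the wool instrument, and the muslin ratio gives the 3.5 quoted for the cotton instrument.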


MODULUS TESTS 


Results which are also of interest from the viewpoint of variance 
components have recently been obtained on a different type of test. 
Identical yarns were tested in two types of machines with the intention
of obtaining measurements of the stress-strain modulus corresponding
to the normal Hooke’s Law modulus for extensible materials in tension
tests. One machine was of dynamic resilience type where the yarn is 
maintained in forced vibration and the modulus is estimated from the 
resonant frequency; it was specially designed and constructed in the 
Textile Research Institute laboratories. The other was basically similar 
to the usual tension tester but with the yarn strained at constant rate, 
the stress being measured by strain gages and autographically recorded 
as a function of the extension; it was designed and built by the Instron 
Machine Company. This test might have been a pure case of
comparison of two instruments as originally suggested by Grubbs but the
modulus is apparently dependent upon frequency; that obtained static- 
ally is appreciably lower than that obtained dynamically. The data are 
given in Table IV. 

Lacking other groups of data, the only reasonable estimate of b 
seemed to be based upon a constant ratio between static and dynamic 
moduli—i.e., a straight line through the origin. While further experi- 
mental confirmation would, of course, be desirable, it was felt that this 
method of estimating b could be trusted in this instance. On this basis 
the estimated variances, in units of 10⁻⁴ on the squared scale of Table IV, are 29.4 for the direct
estimate of dynamic modulus, 75.1 for the estimate of dynamic modulus 
by static test and 27.2 for sample-to-sample fluctuation. Since measure- 
ment and sample-to-sample variances were about equal, further im- 
provement in dynamic testing could, at most, halve the number of 
tests required for a given accuracy. Thus additional development did 
not seem worthwhile. 


CONCLUSIONS 


Material has been presented to illustrate the possible application of 
separation of variance technique in research in the textile field. Com- 
ponents so obtained have been compared with the more efficient esti- 
mates from analysis of variance in cases where the experimental design 
allowed such a possibility. The two have been shown to give essentially 
similar results in the present work. The results have been used to show 
the possibility of applying electrical instruments for precise determina- 
tion of moisture in textile materials, the precision being indicated to be 
better than that by an older and more standard method, that of loss 
in weight by oven test. Precautions necessary for the proper use of 
these instruments have not been included in this particular paper. 


ACKNOWLEDGMENT 


The author wishes to acknowledge numerous contributions to work 
reported. Tests were made under the sponsorship of the Textile Re- 
search Institute. The direction and prosecution of the moisture results 
would have been impossible without the assistance of Dr. R. K. Toner 
and Miss Carol F. Bowen, both of the Textile Research Institute staff, 
the former also of the Chemical Engineering Department of Princeton 
University. The modulus data were obtained from work performed by 
Dr. T. F. Evans, also of the Institute staff. Advice from Dr. John W. 
Tukey, Mathematics Department, Princeton University, was invaluable.
The source of the meters and samples used is noted in previous
publications.




LITERATURE CITED 


1. Grubbs, F. E., “On Estimating Precision of Measuring Instruments and Product
Variability”, J. Am. Stat. Assn., 43, 243, 1948.

2. Smith, H. F., “Estimating Precision of Measuring Instruments”, J. Am. Stat.
Assn., 45, 447, 1950.

3. Toner, R. K., Bowen, C. F., and Whitwell, J. C., “Moisture Determination in
Textiles by Electrical Meters: Part I”, Textile Research J., 18, 526, 1948.

4. Toner, R. K., Bowen, C. F., and Whitwell, J. C., “Moisture Determination in
Textiles by Electrical Meters: Part II”, Textile Research J., 19, 1, 1949.

5. Tukey, J. W., “Components in Regression”, Biometrics, 7, 33-69, 1951.

6. Whitwell, J. C., and Toner, R. K., “Moisture Determination in Textiles by
Electrical Meters: Part III”, Textile Research J., 19, 756, 1949.

7. Whitwell, J. C., Bowen, C. F., and Toner, R. K., “Moisture Determination in
Textiles by Electrical Meters: Part IV”, Textile Research J., 20, 400, 1950.




QUERIES 


QUERY 86: In some investigations involving the comparison of
samples of measures taken under differing experimental conditions,
the matter of primary interest is the effect of the differential
treatments upon the variability of the measures, rather than upon their 
central tendencies. For example, suppose subjects were asked to adjust 
themselves to the vertical after being placed in a position of tilt. Let 
the respective tilts be 15, 30, 45, 60, 75, and 90 degrees. We are in- 
terested in the variability change as the tilt increases; the mean error 
of adjustment may or may not be constant for the different tilts, but 
it is of no concern here. 

For a simple classification, such as that described, we need only 
assess the heterogeneity of variance, say by Bartlett’s method; if the 
variance is heterogeneous and if it shows a consistent trend, we have 
the needed information. However, if the classification is multiple; e.g., 
if beside different tilt we vary also sex, and period of delay before 
making the adjustment, the problem becomes more complex. We can 
still test homogeneity of rows, columns, cubicles, etc., but it would be 
nice if we could get information similar to that obtained on a triple 
classification analysis of variance of the original measures. 

It is proposed that, if for each cubicle, the variance is computed, 
then this variance may be treated as a measure of some trait. Inasmuch 
as variances are additive, then the measures so obtained are subject 
legitimately to analysis of variance techniques (provided, of course, that 
the basic assumptions of random sampling, normal universe, etc., are 
met). That is, we propose to make an analysis of variance of variances. 
Under the circumstances outlined, there is of course only one replica- 
tion; i.e., only one measure per cubicle, and we cannot evaluate the 
remainder term. This difficulty could be surmounted by a replication 
of the entire study which would give another set of variances, etc. 

Would you please give me your opinion on the validity of the
proposed technique?


ANSWER: Your problem has been discussed by M. S. Bartlett and
D. G. Kendall in the Supplement to the Journal of the
Royal Statistical Society, Vol. 7, pp. 128-138, 1946.
Transformation to logarithms is recommended. But it is pointed out 
that the transformation reaches 90 percent efficiency only if the variance 




in each cubicle has 10 degrees of freedom. With 20 d.f. the efficiency 
rises to 95 percent. 

So far as I can see, there is no reason for providing more than one
estimate of variance per cubicle. The basic estimate of experimental
error is the highest order discrepance in your table of multiple
classification.


PALMER O. JOHNSON.


QUERY 87: I have constructed a scale for measuring interests in
various science subjects by a procedure similar to that customarily
used in constructing attitude scales. In constructing this
scale I made sure that the scale value for each item was the same for 
each science for which it was designed. Therefore, it would appear that 
the scale should be valid for differential measurement of a student’s 
interests in various science subjects. To see if the scales meet this 
expectation I have been analyzing some data collected on students. 
These data consist of scores on the scale for each science which the 
student had taken at the time the measurements were made and an 
indication of the student’s major subject. On the assumption that the 
student’s measure of interest should be highest for that subject in which 
he is majoring, the interest measure for the major subject should rank 
first (or high) among the other measures. 

In the case of the students who have taken two science subjects 
including the major, therefore, I have determined the number of 
students for whom the score on the major subject ranks first and the 
number for whom it ranks second. Since on the basis of chance one 
would expect the major subject to rank first half the time and second 
half the time, it is possible to test the obtained distribution of
frequencies against an equal probability hypothesis by the chi-squared
test. Further, since I am interested in the distribution in only one
direction, the appropriate P value is one-half the table P value for the
obtained chi-squared.

In the case of 27 students who have taken three subjects, it is ob- 
served that in the case of 18 students the score for the major subject 
ranks first; for 5 students the score for the major ranks second; and for 4 
students it ranks third. As before I have tested this observed distribu- 
tion against an equal probability hypothesis. A chi-squared value can 
be computed with two degrees of freedom. It is in connection with 
the determination of the P value for this chi-squared that my problem 
arises. 

With an equal probability hypothesis, and with the observed
frequencies of 18, 5 and 4, the obtained chi-squared value is a unique
value. However, there are six possible arrangements of the numbers 
18, 5 and 4 which would yield this chi-squared value. As I understand 
chi-squared, the P value obtained would cover the probability of getting 
all six of the possible arrangements of the observed frequencies. But 
it is only one of these that I am interested in, namely the one which is 
in line with the hypothesis I am testing. 

With these facts in mind it has been suggested that since I am
interested in only one particular distribution of the six possible
distributions and since the chi-squared value for the 27 people is unique, it
might be possible to determine the probability of this particular dis- 
tribution by dividing the tabled P value by six. If the suggested solu- 
tion is not satisfactory, I would appreciate any other suggestions which 
you might be able to offer me. 


ANSWER: For chi-square with 2 degrees of freedom, such as the one
you describe, the appropriate probability is that read
directly from the table. The same is true for the chi-square
you describe with a single degree of freedom; I do not see the
reason for using half the tabular P.
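The chi-square in question is easily computed from the frequencies given in the query. The sketch below uses the fact that, for 2 degrees of freedom, the upper-tail probability of chi-square has the closed form exp(-chi-square/2), so no table or statistical library is needed; the function names are hypothetical.

```python
import math


def chi_square_equal(observed):
    """Pearson chi-square against equal expected frequencies."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


def p_value_2df(chi2):
    """Upper-tail probability for chi-square with 2 degrees of freedom."""
    return math.exp(-chi2 / 2.0)


observed = [18, 5, 4]   # major subject ranked first, second, third
chi2 = chi_square_equal(observed)
p = p_value_2df(chi2)
print(f"chi-square = {chi2:.2f} on 2 d.f., P = {p:.4f}")
```

The probability printed is the one read directly from the table, without division by six or by two.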

The model on which these chi-square tests are based is the multi- 
nomial distribution (trinomial and binomial, respectively). One as- 
sumption made is that there is a common probability associated with 
all the individuals found in any one cell of the distribution. It seems 
questionable that this assumption is realistic in your investigation. I 
think it unlikely that the 18 students whose score for the major subject
ranked first arrived at this ranking all with the same probability. There
must be a good deal of what Cochran called extraneous variation
(Jour. Am. Stat. Assoc., 38:287-301, 1943).

This question is raised because there are other models which may
be more suitable. The first that presents itself is that of the two-factor 
experiment (otherwise known as randomized blocks) with a single score 
in each cell. One way to describe this model is that it is based on random
sampling from normal distributions with common $\sigma^2$ together with
additive main effects. You may have constructed your scales, as many
do, to yield near-normal distributions. As to the randomness of sampling,
in the sense that there are no correlations among the normal
deviates composing $\sigma^2$, I have no means of judging: one might suspect
such correlations among scores turned in by students of a common in- 
structor who is either popular or detested. I see no reason to question 
either a common population variance or the additive feature in the main 




effects. If the model is applicable, the appropriate analysis of variance 
of the 81 scores is: 


Science Subjects 2 degrees of freedom 
Students 26 
Error 52 
Total 80 


If the rating scales produce distributions which are far from normal, 
you might consider Brown’s distribution-free test described by Mood 
in “Theory of Statistics,” page 399. 




ABSTRACTS 


THE BIOMETRIC SOCIETY—BRITISH REGION 


Abstracts of papers for meeting on 
Thursday, 14 December 1950 


139 H. O. HARTLEY. Double balancing of incomplete block de- 
signs. 


The well-known field experiment known as ‘Randomised Blocks’
consists of b Blocks, each containing k experimental plots which are
allotted to the k treatments. With the ‘Incomplete Block Design’ in- 
troduced by Yates (1936) the number of treatments (v) to be compared 
exceeds the number of plots (k) available in each block and a special 
balanced arrangement (developed by Fisher, Yates, Cox, Bose and 
others) is required to obtain comparisons of equal precision for all 
treatment-pairs. Such an arrangement is shown below for b = 7, k = 4
and v = 7. 


                         Block (Plant No.)
                     1   2   3   4   5   6   7
Plot            1    3   4   5   6   7   1   2
(Position       2    5   6   7   1   2   3   4
of leaf)        3    6   7   1   2   3   4   5
                4    7   1   2   3   4   5   6


The design is particularly useful with experiments (usually not field 
experiments) in which there are distinct ‘Blocks’ each containing a 
strictly limited number of experimental units. Youden (1937) used the 
design in comparing the lesions produced on leaves by the tobacco- 
mosaic virus applied in v different solutions. His ‘Blocks’ were repre- 
sented by b plants with leaves playing the part of ‘plots’. In order to 
control any possible effect on the lesions of the position of the leaf down 
the stem, Youden rearranged the treatments for each plant in such a 
way that each treatment was applied exactly once to a top leaf, once 
to a leaf in the 2nd level etc. as shown above. This double balance is
always possible when the number of blocks (plants) equals the number 
of treatments (both = 7 above). Of the 63 Incomplete Block Designs
listed in Fisher & Yates’ Tables, 28 are of this kind and are tabulated
in the double balanced form, i.e. as Youden Squares, in Cochran &
Cox’s book (1950). The present paper gives a similar method of double 
balancing for the remaining 35 designs. In 23 of these it will be neces- 
sary to split the treatments into groups such that any two treatments 
in the same group can be compared with equal precision, and treat- 
ments in different groups with a slightly lower precision. Experiments 
in which these designs are useful will be discussed. 
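The double balance of the arrangement given above for b = 7, k = 4 and v = 7 can be checked mechanically. The Python sketch below is an illustration only. It takes the second-row entry for block 3 to be 7, and the function names are hypothetical. It confirms that every treatment occurs once at each leaf position and that every pair of treatments occurs together in the same number of blocks.

```python
from collections import Counter
from itertools import combinations

# Rows are leaf positions 1-4; columns are blocks (plants) 1-7.
square = [
    [3, 4, 5, 6, 7, 1, 2],
    [5, 6, 7, 1, 2, 3, 4],
    [6, 7, 1, 2, 3, 4, 5],
    [7, 1, 2, 3, 4, 5, 6],
]

def pair_counts(square):
    """Count, for each pair of treatments, the blocks containing both."""
    counts = Counter()
    for block in zip(*square):  # each column is one block
        for pair in combinations(sorted(block), 2):
            counts[pair] += 1
    return counts

def is_doubly_balanced(square, v):
    """True if every row holds each treatment once and all pairs concur equally often."""
    treatments = set(range(1, v + 1))
    rows_ok = all(len(row) == v and set(row) == treatments for row in square)
    counts = pair_counts(square)
    all_pairs_present = len(counts) == v * (v - 1) // 2
    equal_concurrence = len(set(counts.values())) == 1
    return rows_ok and all_pairs_present and equal_concurrence

print(is_doubly_balanced(square, 7))
print(set(pair_counts(square).values()))
```

In this arrangement each pair of treatments meets in two blocks, in agreement with r(k - 1)/(v - 1) = 4 x 3 / 6 = 2.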


140 JOAN MAY & A. LUBIN. Eliminating superfluous variables in 
an analysis of variance with disproportionate frequencies. 


In ordinary analyses of variance, the effect of each level of a variable 
is estimated by the mean of all sub-cells at that level. This allows the 
total deviance to be analysed into components due to each variable, 
their interactions, and an error term, all of which are mutually linearly 
uncorrelated. When the sub-cell frequencies are not proportional to the 
marginal totals, using the means as estimators results in the deviance 
components being correlated with one another. In this case, the pro- 
cedure usually recommended is to fit the estimators by least-square 
methods as an application of multiple regression. 

The labour involved in fitting constants by least-squares increases 
out of proportion as the number of variables increases. It is suggested 
that in any p-way analysis of variance, a set of k variables (k < p) 
should be sought such that the deviance accounted for by the k variables
does not differ significantly from the deviance accounted for by the p
variables. In this way we will eliminate a set of (p - k) variables,
whose means may or may not vary significantly but have no unique 
contribution to make to the deviance accounted for by the k variables. 
Two methods of finding such a set of k variables will be described, an 
exhaustive method, which tests all possible combinations of the p 
variables; and an approximate method, which tests only certain combi- 
nations of the p variables. 
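The abstract does not give the authors' computations. The Python sketch below is therefore only one possible reading of an exhaustive search, under stated assumptions: main effects alone are fitted by least squares (no interactions), each k-subset is compared with the full p-variable fit by the usual extra-sum-of-squares F ratio, and the data are simulated. All function and variable names are hypothetical.

```python
from itertools import combinations
import numpy as np

def design_matrix(data, names, n):
    """Intercept plus indicator columns (first level dropped) for each named variable."""
    cols = [np.ones(n)]
    for name in names:
        values = data[name]
        for level in sorted(set(values))[1:]:
            cols.append(np.array([1.0 if v == level else 0.0 for v in values]))
    return np.column_stack(cols)

def residual_deviance(y, X):
    """Residual sum of squares and rank of a least-squares fit."""
    beta, _, rank, _ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid), int(rank)

def exhaustive_search(data, y, k):
    """F ratio for dropping the remaining (p - k) variables, for every k-subset."""
    names = sorted(data)
    n = len(y)
    rss_full, rank_full = residual_deviance(y, design_matrix(data, names, n))
    error_ms = rss_full / (n - rank_full)
    f_stats = {}
    for subset in combinations(names, k):
        rss_sub, rank_sub = residual_deviance(y, design_matrix(data, subset, n))
        f_stats[subset] = ((rss_sub - rss_full) / (rank_full - rank_sub)) / error_ms
    return f_stats

# Simulated, disproportionate data in which only variable "A" affects y.
rng = np.random.default_rng(1)
n = 60
data = {
    "A": rng.integers(0, 3, size=n).tolist(),
    "B": rng.integers(0, 2, size=n).tolist(),
    "C": rng.integers(0, 2, size=n).tolist(),
}
y = 10.0 * np.array(data["A"]) + rng.normal(size=n)
f_stats = exhaustive_search(data, y, k=1)
print(f_stats)
```

A subset whose F ratio is not significant would be a candidate set of k variables. Here the subset containing only "A" should give by far the smallest ratio.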


141 A. F. PARKER-RHODES. Information from measurement of
tetraspores. 


If a diploid plant producing tetraspores is heterozygous for any pair 
of alleles affecting any measurable character of the spores, (other than
by maternal effect), the latter should fall into two classes whose exist- 
ence could be determined by a statistical analysis of a large sample of 
measurements. In practice it is to be expected that the genes detectable 
in this way will be numerous and well-scattered among the chromo- 
somes, so that the amount of heterozygosity appearing in the tetra- 
spores, if it could be measured, would be a consistent estimate of that 
of the individual as a whole. 

We assume that in the absence of such segregation, the distribution 
of the character measured would be normal or simply-skew; the actual 
distribution is not of this kind; the problem is to estimate the number 
and nature of the normal or simply-skew components of which the 
given distribution is compounded. If the number of components is
finite, efficient statistics for this purpose exist, but are fantastically
laborious to compute; statistics based on moment functions, though the 
efficiency may be as low as 1/4, are relatively easy to calculate. By 
their means we can detect up to four components and rigorously com- 
pare two individuals as to the residual heterozygosity not thus ac- 
counted for. Subject to certain reservations, a single individual may 
for this purpose be taken as a fair sample of a population. 

The possibility of the nucleus of a basidiospore affecting the visible 
characters of the spore having been disputed, the method was applied 
to determine this question. In the agaric Psilocybe dispersa it was 
found that the distribution of spore breadths was incompatible with 
the view that there were no filial effects; it was concluded that the 
method is applicable to agarics, at least of some species. 

The method has been used extensively in trying to ascertain whether 
the agarics in the island of Skokholm (Wales) are subject to genetic 
isolation through their geographic situation; if they were so, it would be 
expected that their populations would be more homozygous on the 
island, than those of the same species on the mainland. Evidence that 
such isolation obtains was found in each species examined, being most 
complete in the case of Panaeolus papilionaceus. 

No direct demonstration has yet been made of the occurrence of 
natural hybrids among the higher fungi; an $F_1$ hybrid would be expected
to show unusually wide segregation in tetraspore characters, and
ought thus to be detectable by these methods. Certain specimens 
collected on Skokholm were suspected of being hybrid and examined; 
the result was unexpectedly to reveal the existence of a new species, 
itself presumably of hybrid origin, as well as hybrids between this 
species and its supposed parents. 

The method has excellent prospects of wider application.




142 M. R. SAMPFORD. The analysis of data from time-mortality
trials.


When a group of individuals is treated with a poison, it is frequently
found that the distribution of survival times is either normal, or can 
be made normal by a simple transformation (usually logarithmic) of 
the time-scale. The estimation of the parameters of the distribution, 
however, is frequently complicated by the presence of one or more 
‘interference effects’. These effects may be classified into two general 
types. 

(a) Presence of an alternative cause of death. 

Deaths may occur among the members of the group from causes 
other than the poison. These deaths may be classed as accidental (due 
to such causes as post-operative shock, illness, and damage during 
routine inspection) or natural. The natural death rate becomes im- 
portant in experiments whose duration is an appreciable proportion 
of the expected life of the individual (for example, in investigation of 
the toxic action of atomic radiation). 

(b) Failure to obtain 100% response. 

It is sometimes necessary to suspend observation before an experi- 
ment is complete, or to perform interim calculations on results from an 
experiment still in progress. The data obtained are then sharply trun- 
cated at the upper end. Methods for dealing with such data have been 
discussed by Bliss. In some experiments, a number of individuals 
survive the treatment; this situation arises, for example, in the analysis 
of data from quantal response assays, if an attempt is made to obtain 
additional information from the times to death of animals killed. Two 
cases may be distinguished; that in which individuals survive because
they are immune to the poison, and that in which they recover from
its effects, the two situations giving rise to different mathematical
models.


Maximum likelihood methods for these problems are considered, and 
a number of practical techniques are described. 


143 C. C. SPICER. Some points in the statistics of counting bacteria.


Most of the statistical theory that has been used in counting bacteria 
is concerned with a method in which only the presence or absence of 
bacteria in a sample is noted. This extremely inaccurate procedure is
only used when no other is available, and the more usual techniques in 
which counts are made also present some points of statistical interest 
which are considered in this paper. The topics dealt with include the 
problem of combining results from several dilutions, the estimation of 
dilution errors, and the effect of clumping of bacteria on the estimate 
of total numbers. 


Abstracts of papers for meeting on Tuesday, 13 March 1951 


144 J. G. SKELLAM. Phylogeny as a Stochastic Process.


It is known for a wide variety of groups of living organisms that the
number of species in a genus has a logarithmic distribution. 

In order to provide an explanation, the evolutionary tree is regarded 
as the outcome of a stochastic process in which at intervals each line of 
descent is liable 


(1) to remain unbranched 
(2) to branch into two 
(3) to disappear from the genus to which it belongs either by 
(a) becoming extinct 
(b) becoming the starting point of a new genus (i.e., changing its 
generic name). 


By the method of generating functions we obtain the equation

$$\Phi_{n+1}(u) = \Phi_n(\alpha u + \beta u^2)$$

where $\Phi_n$ is the factorial moment generating function of the number of
species in a genus which originated exactly n years ago.

The solution of this equation leads to a geometric distribution with 
an abnormal zero-th class. The logarithmic distribution soon follows by 
integrating over n. 


145 R. E. BLACKITH. Biological Assays with Resistant Strains of
Insects.


Two distinct problems face the experimenter. In one case a strain, 
already resistant, has to be compared with non-resistant stocks. In 
others the experimenter can choose his material and create a resistant 
strain by selection. Both cases are illustrated from problems currently
being investigated, using respectively mercury vapour and the pyrethrins 
against the grain weevil Calandra granaria L. 

The use of some properties of the dosage-response regression curves 
to suggest the mechanism of resistance is indicated, as are some of the 
biometrical problems encountered. 


146 J. M. TANNER. A Discriminant Function for Androgyny. 


Between men and women there are various differences in body build 
apart from those of the external genitalia and breasts. These other 
differences are not qualitative like those of the reproductive organs, but 
quantitative, and the distributions of the characters concerned overlap 
for the two sexes. Thus, in these characters a man may approach the 
feminine build either more or less, and this component of physique is 
known as androgyny. Various measures of the degree of androgyny have 
been used in endocrinological studies, and studies of growth and development;
perhaps the best known is the hip width/shoulder width ratio.
The present proposal is to combine two or more anthropometrical meas- 
urements using weights given by maximum discrimination between men 
and women. On the basis of two measurements the best straight line 
discriminant between 237 men and 172 women students is 


3 shoulder width − hip width − 82 = 0

and on the basis of three measurements

2 shoulder width + 0.53 leg length − 1.25 hip width − 81 = 0.


The first line misclassifies 14 men and 34 women, the second 10 men
and 24 women; the hip width/shoulder width ratio misclassifies 42 men
and 46 women. Androgyny is measured by the distance from the discriminant
line, a given person’s masculinity score being 3 shoulder width
− hip width. These scores are approximately normally distributed for
persons of each sex. An example of the use of the scale in medical work 
will be given. 
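The two discriminants quoted above are simple enough to evaluate directly. The Python sketch below does so. The function names are hypothetical. The abstract states neither the units of measurement nor which side of each line corresponds to which sex; the comments assume that larger values are the more masculine, since 3 shoulder width − hip width is called the masculinity score. The sample measurements are invented.

```python
def masculinity_score(shoulder_width, hip_width):
    """Two-measurement score quoted in the abstract: 3 * shoulder width - hip width."""
    return 3.0 * shoulder_width - hip_width

def two_measurement_discriminant(shoulder_width, hip_width):
    """Value of 3 * shoulder - hip - 82; zero lies on the discriminant line.

    Assumption: positive values fall on the masculine side.
    """
    return masculinity_score(shoulder_width, hip_width) - 82.0

def three_measurement_discriminant(shoulder_width, leg_length, hip_width):
    """Value of 2 * shoulder + 0.53 * leg length - 1.25 * hip - 81."""
    return 2.0 * shoulder_width + 0.53 * leg_length - 1.25 * hip_width - 81.0

# Invented measurements, in whatever units the original study used.
shoulder, leg, hip = 40.0, 80.0, 33.0
print(masculinity_score(shoulder, hip))
print(two_measurement_discriminant(shoulder, hip))
print(three_measurement_discriminant(shoulder, leg, hip))
```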

The following questions will be put up for discussion: 


(a) Am I justified in using a discriminant function in this way? If 
so, why? What sort of scale does the procedure lead to and can 
one apply usual statistical procedures to it? 

(b) Suppose no women existed; how would one go about measuring 
androgyny in men? 

(c) Dysplasia is the name given to differences or disharmonies in
body build existing between different bodily regions in the same 
person (narrow chest plus stocky heavily-muscled legs, etc.) 
People differ in their degree of dysplasia. How should one meas- 
ure it? 


147 W. D. M. PATON & W. L. M. PERRY. The Analysis of Elec- 
trical Records from a Sympathetic Ganglion in the Cat. 


It is frequently necessary to study the effects of drugs which act on 
the nervous system by measuring the changes which they produce in the 
electrical responses of such tissues to stimulation. The shape of the 
complex electrical record which can be recorded from the cat’s superior 
cervical ganglion is greatly altered by the action of many drugs, e.g. 
nicotine. The question which arises is how to express this change in 
quantitative terms. 

On the basis of what is already known, there are good grounds for 
assuming that the latter part of the electrical complex at least is the sum 
of two exponentially decaying processes of opposite sign and with differ- 
ing time constants and initial magnitudes. It is impossible to say from 
simple inspection of the complex after the action of a given drug which of 
these parameters have been affected. For instance, a shortening of the 
time constant of one of the processes may produce a result easily confused 
with that resulting from a reduction in initial magnitude of the same pro- 
cess. 

We have therefore made an attempt at some sort of quantitative 
analysis of these records. In the normal record, the time constants of 
the two processes are sufficiently dissimilar to allow the direct measure- 
ment, by a simple logarithmic plot, of the time constant of the slower 
positive wave, and then, by subtraction, of that of the quicker negative 
wave. Further, by extrapolation back to zero time, estimates can be 
obtained of the initial magnitudes of the two waves; these are also neces- 
sary in interpreting the physiological significance of such records. 
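The graphical procedure just described (a logarithmic plot for the slower wave, then subtraction for the quicker one) can be imitated numerically. The Python sketch below is an illustration under stated assumptions, not the authors' method: the record is taken to be a slow positive exponential minus a fast negative one, the data are synthetic and free of noise, and the names `peel_two_exponentials`, `t_split` and `floor` are hypothetical.

```python
import numpy as np

def peel_two_exponentials(t, y, t_split, floor=0.01):
    """Estimate y = a_slow*exp(-t/tau_slow) - a_fast*exp(-t/tau_fast) by peeling.

    t_split: time after which the fast wave is assumed negligible.
    floor: fraction of the largest residual below which points are ignored.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)

    # Step 1: straight line through log(y) on the tail gives the slow wave.
    tail = t >= t_split
    slope, intercept = np.polyfit(t[tail], np.log(y[tail]), 1)
    tau_slow, a_slow = -1.0 / slope, np.exp(intercept)

    # Step 2: subtract the record from the extrapolated slow wave; the
    # difference is the fast wave, fitted the same way on its usable part.
    head = t < t_split
    residual = a_slow * np.exp(-t[head] / tau_slow) - y[head]
    keep = residual > floor * residual.max()
    slope_f, intercept_f = np.polyfit(t[head][keep], np.log(residual[keep]), 1)
    tau_fast, a_fast = -1.0 / slope_f, np.exp(intercept_f)
    return a_slow, tau_slow, a_fast, tau_fast

# Synthetic record with well-separated time constants (arbitrary units).
t = np.linspace(0.0, 40.0, 401)
y = 5.0 * np.exp(-t / 10.0) - 3.0 * np.exp(-t / 1.0)
a_slow, tau_slow, a_fast, tau_fast = peel_two_exponentials(t, y, t_split=10.0)
print(a_slow, tau_slow, a_fast, tau_fast)
```

When the two time constants are close, or the fast wave is very brief, few points survive the `floor` test and the second fit becomes unreliable, which is the difficulty the authors go on to raise.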

After the action of certain drugs, however, particularly nicotine, the 
time constant of the negative wave is so much reduced that by this simple 
graphical method it is difficult to determine it, and impossible to obtain 
a reliable estimate of the initial magnitude of the negative process. It is 
for this reason that we have been led to present this paper. Can anyone 
suggest a better means, graphical or otherwise, of fitting these observed 
complexes by the sum of two exponential curves? The question is an 
important one, and its solution might shed a good deal of light on the 
physiology and pharmacology of the nervous system. 




ABSTRACTS OF CONTRIBUTED PAPERS PRESENTED AT A
JOINT MEETING OF THE BIOMETRIC SOCIETY (ENAR)
AND THE INSTITUTE OF MATHEMATICAL STATISTICS AT
OAK RIDGE, TENNESSEE, MARCH 15-17, 1951


148 A. E. BRANDT (U.S. AEC, New York). Forms of Analysis for
Either Measurement or Enumeration Data Amenable to Machine
Methods.


A matrix-vector product method and a step of factorial method for 
reducing data by the analysis of variance or covariance or by Chi Square, 
according as the data are measurements or counts, are presented. 
Though these methods are most valuable for use with modern computing 
machines, they are useful in presenting the concept of degrees of freedom 
and, in some cases, for hand calculations. 


149 W. J. YOUDEN (National Bureau of Standards). Linked 
Blocks: A New Class of Incomplete Block Designs. 


Available balanced incomplete block designs require several replica- 
tions. This is not a serious objection in field trials because the large 
standard error in such work often makes it desirable to have a fair num- 
ber of replications. The various lattice arrangements that are available 
reduce the number of replications at the price of losing symmetry and 
introducing a considerable complexity in the analysis of the data. 
Physical and chemical measurements are comparatively high in pre- 
cision and are commonly made in duplicate or triplicate. A series of 
designs has been constructed which possesses symmetry, ease of analysis, 
reasonable block size and a small number of replications. Examples of 
the new designs are 


size of    number of    number of     number of
block      blocks       treatments    replications

  4           5            10             2
  5           6            15             2
  4           9            12             3
  6           7            14             3
  6          13            26             3
  7          15            35             3




150 CHURCHILL EISENHART (National Bureau of Standards). 
On the Statistical Analysis of Linked-Block Experiments. 


Let $\{x_{r'ij}\}$ denote the observations yielded by an experiment involving
$r$ measurements on each of $t$ treatments (or varieties) arranged in $b$
blocks of size $k$ in accordance with a Youden linked-block design, and let

$$x_{r'ij} = \mu + \tau_j + \beta_i + \rho_{r'} + \epsilon_{r'ij}$$

where the constant $\mu$ denotes the general level of performance, the
constants $\tau_j$, with $\sum_{j=1}^{t} \tau_j = 0$, denote the “treatment effects”, the $\beta_i$,
with $\sum_i \beta_i = 0$, denote the “block effects”, the $\rho_{r'}$, with $\sum_{r'} \rho_{r'} = 0$,
denote the “replicate effects” ($1 \le r' \le r$), and the “errors” $\epsilon_{r'ij}$ are
independent normally distributed random variables with zero means and
common variance $\sigma^2$. By application of the method of least squares the
“best” unbiased estimators of $\mu$, $\sigma^2$, and the $\tau$’s, $\beta$’s and $\rho$’s are derived;
and the analysis-of-variance tests (F-tests) for the existence of treatment
effects (i.e. $\tau_j \ne 0$ for at least one, and therefore for at least two values
of $j$), block effects, and replicate effects are deduced and their power
functions discussed. Also t-tests are derived for testing the significance
of the difference between the effects of treatments $j_1$ and $j_2$ according as
they do, or do not, occur together in some block.


151 R. A. BRADLEY and M. E. TERRY (Virginia Polytechnic
Institute). Rank Analysis of Incomplete Block Designs I (Preliminary
Report).


True preferences or ratings, $p_1, \ldots, p_t$, are assumed to exist for $t$
treatments in balanced incomplete blocks of two, $\sum p_i = 1$. When
treatments $i$ and $j$ appear together, the probability that $i$ is “better”
than $j$ is taken to be $p_i/(p_i + p_j)$.

The likelihood ratio test criterion is used to test the hypothesis that
$p_i = 1/t$ for all $i$ against the alternatives (i) that the p’s are not all equal
and (ii) that two exhaustive groups of treatments have equal ratings but 
the groups themselves differ. Alternative hypotheses that involve a 
subset of treatments produce tests dependent on nuisance parameters. 
The likelihood ratio criterion produces tests which differ considerably 
from Kendall’s coefficient of agreement for paired comparisons. Tables 
and exact distributions are in preparation. The extension and generaliza- 
tion of the theory to blocks of more than two will be presented subse- 
quently. 
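The preference model stated at the head of this abstract lends itself to a brief numerical illustration. The Python sketch below evaluates the probability that one treatment is rated better than another, and the log likelihood of a set of paired judgments for given ratings. The ratings and counts are invented, the function names are hypothetical, and no attempt is made to compute the maximum likelihood estimates needed for the authors' likelihood ratio criterion.

```python
import math

def prob_better(p_i, p_j):
    """Probability that treatment i is rated better than treatment j."""
    return p_i / (p_i + p_j)

def log_likelihood(ratings, wins):
    """Log likelihood of paired judgments.

    wins[(i, j)] is the number of comparisons in which i was preferred to j.
    """
    return sum(count * math.log(prob_better(ratings[i], ratings[j]))
               for (i, j), count in wins.items())

# Invented example: t = 3 treatments, 4 judgments on each pair.
ratings = [0.5, 0.3, 0.2]
wins = {(0, 1): 3, (1, 0): 1, (0, 2): 4, (2, 0): 0, (1, 2): 3, (2, 1): 1}

null_ratings = [1.0 / 3.0] * 3  # hypothesis p_i = 1/t for all i
print(prob_better(0.5, 0.25))
print(log_likelihood(null_ratings, wins))
print(log_likelihood(ratings, wins))
```

Under the hypothesis of equal ratings every comparison has probability one half, so the first log likelihood is simply 12 log(1/2).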




152 A. E. BRANDT (U.S. AEC, New York). Some Notes on Some
Growth Functions. 


The details of methods of reducing growth data of chicks during the 
period from 1 to 12 weeks of age to two or three constants for comparing
groups are presented. Exponential functions of the types $y = ae^{bx}$ and
$y = ae^{bx} + c$ are used. In general, the constants are evaluated by fitting
linearized or rectified forms of these functions to experimental data by
Least Squares. 
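A rectified fit of the kind mentioned can be sketched briefly in Python. The example assumes the exponential form y = a exp(bx) + c, with c known in advance (zero for the simpler form), so that log(y − c) is linear in x. How the author chose c is not stated in the abstract. The weights are synthetic, and the function name `fit_exponential` is hypothetical.

```python
import numpy as np

def fit_exponential(x, y, c=0.0):
    """Fit y = a * exp(b * x) + c by least squares on log(y - c) = log(a) + b * x.

    Assumes c is supplied by the user and that y - c is positive throughout.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    b, log_a = np.polyfit(x, np.log(y - c), 1)
    return float(np.exp(log_a)), float(b)

# Synthetic weights for weeks 1 to 12; not data from the paper.
weeks = np.arange(1, 13)
weights = 40.0 * np.exp(0.25 * weeks)
a, b = fit_exponential(weeks, weights)
print(a, b)
```

Least squares applied to the logarithms weights the observations differently from least squares on the original scale, a point worth bearing in mind when groups are compared by their fitted constants.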


153 M. L. CLARK and F. X. LYNCH (Armed Forces Institute of
Pathology). Clinical Symptoms of Radiation Sickness, Time to
Onset, and Duration of Symptoms among Hiroshima Survivors
in the Lethal and Median Lethal Ranges of Radiation.


Comparison is made of two groups of survivors of the atomic bomb 
explosion in Hiroshima, Japan, who were out of doors and apparently 
unshielded within ranges of initial nuclear radiation defined as lethal and 
median lethal dosages. Radiation sickness developed in all these sur- 
vivors. The clinical signs and symptoms, the time from bombing to 
onset, and duration of the symptoms are compared for the two groups. 




THE BIOMETRIC SOCIETY 


Following an inquiry by the Secretariat of the Economic and Social 
Council of the U. N. concerning activities and possible collaboration, the 
Biometric Society expressed its interest in the Sub-Commission on
Statistical Sampling and in the program of training of statisticians of 
ECOSOC. In consequence, the Society has been placed on the register 
of non-governmental organizations maintained by the Secretary-General 
for the purpose of consultation when this would seem appropriate. 

As of mid-December, 103 replies had been received to the question- 
naire concerning the teaching of biometry distributed last February by 
the Committee of the Society and of the Section of Biometry of the 
IUBS. The most frequently expressed desire was for information about 
the contents of courses in biometry, and this will probably be first on 
the agenda of the Committee. It is hoped that other requests can also 
be considered. As information is developed, it will be distributed to 
those who have expressed their interests in the activities of the Com- 
mittee by returning the questionnaire. 

A limited number of copies of the Proceedings of the 1949 Geneva 
Conference which appeared in the March, June and September 1950 
issues of BIOMETRICS are available. This material has been as- 
sembled under one cover and may be obtained from the Secretary’s 
office at $4.00 per copy.

The following Regional officers have been elected for 1951: 


Eastern North American Region
    Vice President            H. W. Norton
    Secretary-Treasurer       W. T. Federer

Western North American Region
    Vice President            G. A. Baker
    Secretary-Treasurer       W. C. Rollins

Australasian Region
    Vice President            C. W. Emmens
    Secretary-Treasurer       J. A. Keats


The general officers of the Society for 1951 have been reduced in 
number on recommendation of the Special Finance Committee by com- 
bining the posts of Secretary and Treasurer. The council has reelected 
Arthur Linder as president and named C. I. Bliss as Secretary-Treasurer. 
The following six members were elected in our annual mail ballot to
serve on the Council for the term 1951-1953: D. J. Finney, J. W. 
Hopkins, N. K. Jerne, P. C. Mahalanobis, K. Mather and Margaret 
Merrell. 


JOINT MEETING: ENAR and AAAS—Cleveland, Ohio—De- 
cember 27-29, 1950. On December 27-29, the Eastern North American 
Region held a joint meeting with the American Association for the 
Advancement of Science in Cleveland, Ohio. The six sessions arranged 
under the able direction of Professor N. Rashevsky were devoted to a 
symposium on mathematical biology and biometry. The first session, 
with N. Rashevsky presiding, featured papers by H. Branson and J. Z. 
Hearon. The second session, directed by Leslie Nims, featured papers 
by K. Menger, A. Rapoport and A. Shimbel, and P. F. Lazarsfeld. 
The third and fourth sessions were held on December 28, with Karl 
Menger presiding at the former and papers by N. Rashevsky and H. G. 
Landau, and with Herman Branson presiding at the latter and papers 
by L. Nims and B. Harshbarger. The fifth session was held on the 
morning of December 29 under the chairmanship of A. Rapoport with 
papers by H. D. Landahl and I. Opatowski. The last session, on the 
afternoon of December 29, was under the chairmanship of B. Harsh- 
barger, and featured papers by I. Opatowski and G. Sacher. 

In addition to the above sessions, a two-session symposium on “The 
Structure and Analysis of Plant Communities” contained a considerable 
amount of biostatistics and biometry and was scheduled on December 
30 by the A.A.A.S., Section G, and the Ecological Society of America. 


ANNUAL MEETING: EASTERN NORTH AMERICAN RE- 
GION, Chicago, Ill., 1950. The Eastern North American Region held 
its annual meeting in Chicago on December 27-29 jointly with the 
Biometrics Section of the American Statistical Association and the In- 
stitute of Mathematical Statistics. The business meeting was held 
December 29. The new members of the Regional Committee for the 
period 1951 to 1953 are Donald Mainland and H. L. Lucas. The 
scientific program consisted of ten sessions. The papers presented at 
the first eight sessions were listed in BIOMETRICS last September. 
Two sessions of eight contributed papers on Friday afternoon were under 
the chairmanship of Evelyn Fix and Robert Gage. The eight papers 
were: Calculations of median lethal doses when doses are subject to a 
particular error, Clifford J. Maloney; Bio-assay for people who do not 
enjoy computation, I. J. Bross; The use of the Poisson series in the 
evaluation of media for the isolation of cholera vibrios, Oscar Felsen- 
feld; Evaluation of diagnostic tests which yield no false positives, Nathan
Mantel; Sequential procedure for grading milk by microscopic counts, 
M. E. Morgan, P. MacLeod, C. I. Bliss and E. O. Anderson; A study 
of variation with incomplete blocks in taste testing, Lyle D. Calvin; 
Two way analysis of variance with missing data and unequal cell fre- 
quencies, F. E. Satterthwaite; A comparison of variance components in 
corn yield trials, Walter T. Federer. 


FRENCH REGION (Région Française). Meetings were held on December 5 and
December 12 at the École Normale Supérieure, Paris. At the first meeting,
Mr. A. Vessereau discussed experimental design and its application
in agronomy. Second meeting: An outline survey of agricultural
research in France viewed from the standpoint of the applications of
statistics.

On March 7, a meeting was held at the same place. The working session dealt
with the quantitative study of growth, and more particularly of
relative growth. The papers were given by G. Teissier (general
survey) and by D. Schwartz (the case of plants).




NEWS AND NOTES 


Our President, Arthur Linder, has been invited by the International 
Statistical Institute to teach at the International Statistical Education 
Center in Calcutta, India from July to December 1951. He and his 
family will spend some six months in India. 


Virginia Polytechnic Institute 
Statistics Summer Session 1951 


The Department of Statistics, Virginia Polytechnic Institute, will 
hold a special summer session August 8-25, 1951. It will be for graduate 
students, research workers, and technicians in government and industry. 
Special emphasis will be given to statistics in economics and engineering. 
Several visiting professors will participate in the lecturing. For details 
write the Department of Statistics, Virginia Polytechnic Institute, 
Blacksburg, Virginia. 


Summer Sessions at the Statistical Laboratory, University of California, 
Berkeley, California 
1st Session June 18th through July 28th 
2nd Session July 30th through September 8th. 


This year’s summer program at the Statistical Laboratory of the 
University of California includes four of the usual undergraduate courses, 
two in each session, and two graduate courses. One of the latter is a 
regular course of lectures on rank correlation methods and on time series 
analysis. The other graduate course is a seminar on time series and re- 
lated problems. Both graduate courses will be given during the first 
Summer Session by Professor Maurice G. Kendall of the London School 
of Economics and Political Science. Professor J. Neyman will be avail- 
able for consultations on work leading to higher degrees. In addition to 
the above two persons, the faculty of the Summer Session will include 
Dr. Grace E. Bates (Mount Holyoke College), Dr. Colin R. Blyth (Univer- 
sity of Illinois) and Dr. Gottfried E. Noether (New York University). 


Institute of Statistics of The University of North Carolina 
Special Summer Session
The Institute of Statistics of the University of North Carolina is 
offering another summer session in applied and mathematical statistics, 
June 11 to July 18, 1951. This session is for research scholars in other 
sciences who want a practical working knowledge of statistical theory as 


130 


4 
| 
| 
me 
lage 
q 
ij 
| 
i { ° 


NEWS AND NOTES 131 


well as for consultants, teachers and students in statistics. The teachers 
are G. W. Snedecor, W. J. Youden, R. L. Anderson, R. C. Bose, Gertrude 
M. Cox, A. L. Finkner, S. N. Roy, R. J. Monroe and H. Fairfield Smith.
For details write Mrs. Sarah Carroll, Institute of Statistics, State College,
Raleigh, N. C.


Second Annual Session of the Summer Seminar in Statistics 
University of Connecticut, Storrs, Connecticut 
August 6-31, 1951 


With the informal cooperation of several eastern U. S. universities, a
summer seminar in statistics will be held at Storrs, Connecticut, August 
6-31, 1951. As in the case of the 1950 summer session, the purpose is to 
stimulate general exchange of ideas and to bring graduate students (as 
well as professional statisticians) into contact with recent theory and 
applications. It is hoped that competitive scholarships to cover living 
expenses will be available. Part of the organizers and the schedule of the 
main sessions are as follows: 


Aug. 6-10. Statistics in the biological sciences, C. I. Bliss.
Aug. 13-17. Time series and econometrics, M. G. Kendall, J. W. Tukey.
Aug. 20-24. Statistical theory probability, M. Kac.
Aug. 27-31. Statistical techniques with special reference to the social
sciences, Frederick Mosteller, F. L. Strodtbeck, Max A. Woodbury.

The main sessions will be held in the period 3:00 p.m. to supper.
Special sessions (e.g., in preparation for some main sessions) will probably
be held.

The Summer Seminar in Statistics is sponsored by the University of 
Connecticut. There will be ample recreational facilities. Dormitory 
accommodations consist of single and double rooms. A family group of 
three or more should use non-dormitory housing. 

For further details about these programs write the planners. For 
general information write the Executive Committee’s Secretary: D. F. 
Votaw, Jr., Department of Mathematics, Yale University, New Haven, 
Connecticut.




Extra copies of this issue can be secured for $2.00. 


Also copies of Volume 3, Number 1 on Analysis of Variance can be 
obtained for $1.50. 


Please mail requests to the Office of the Editor, Institute of Statistics,
North Carolina State College, Raleigh, N.C. 




